Compositions and methods for modifying a target nucleic acid

ABSTRACT

The present disclosure provides a system for editing genomic DNA, the system comprising an asymmetric donor DNA template; and methods of editing genomic DNA involving use of an asymmetric donor DNA template. The present disclosure provides a system for editing genomic DNA, the system comprising a Cas9 polypeptide with reduced enzymatic activity; and methods of editing genomic DNA involving use of a Cas9 polypeptide with reduced enzymatic activity.

CROSS-REFERENCE

This application claims the benefit of U.S. Provisional Patent Application No. 62/262,189, filed Dec. 2, 2015, which application is incorporated herein by reference in its entirety.

INCORPORATION BY REFERENCE OF SEQUENCE LISTING PROVIDED AS A TEXT FILE

A Sequence Listing is provided herewith as a text file, “BERK-305_Seq_List_ST25.txt” created on Dec. 1, 2015 and having a size of 7,639 KB. The contents of the text file are incorporated by reference herein in their entirety.

INTRODUCTION

RNA-mediated adaptive immune systems in bacteria and archaea rely on Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) genomic loci and CRISPR-associated (Cas) proteins that function together to provide protection from invading viruses and plasmids. In Type II CRISPR-Cas systems, Cas9 functions as an RNA-guided endonuclease that uses a dual-guide RNA consisting of crRNA and trans-activating crRNA (tracrRNA) for target recognition and cleavage by a mechanism involving two nuclease active sites that together generate double-stranded DNA breaks (DSBs), or can individually generate single-stranded DNA breaks (SSBs). The Type II CRISPR endonuclease Cas9 and engineered dual- (dgRNA) or single guide RNA (sgRNA) form a ribonucleoprotein (RNP) complex that can be targeted to a desired DNA sequence. Guided by a dual-RNA complex or a chimeric single-guide RNA, Cas9 generates site-specific DSBs or SSBs within double-stranded DNA (dsDNA) target nucleic acids, which are repaired either by non-homologous end joining (NHEJ) or homology-directed recombination (HDR).

SUMMARY

The present disclosure provides a system for editing genomic DNA, the system comprising an asymmetric donor DNA template; and methods of editing genomic DNA involving use of an asymmetric donor DNA template. The present disclosure provides a system for editing genomic DNA, the system comprising a Cas9 polypeptide with reduced enzymatic activity; and methods of editing genomic DNA involving use of a Cas9 polypeptide with reduced enzymatic activity.

The present disclosure provides a method of editing genomic DNA of a eukaryotic cell, wherein the genomic DNA comprises a target strand and a non-target strand, the method comprising introducing into the cell: (a) a Cas9 guide RNA, or one or more nucleic acids encoding said Cas9 guide RNA, wherein the Cas9 guide RNA hybridizes to a target sequence of the target strand of the genomic DNA; (b) an asymmetric double stranded or single stranded donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides in length (e.g., 20 nucleotides (nt) to 25 nt, 25 nt to 30 nt, 30 nt to 35 nt, 35 nt to 40 nt, 40 nt to 45 nt, or 45 nt to 50 nt in length), is shorter than the 5′ homology arm, and comprises at least 10 consecutive nucleotides of said target sequence; and (c) a Cas9 protein or a nucleic acid encoding said Cas9 protein, wherein (i) the Cas9 protein forms a complex with the Cas9 guide RNA thereby guiding the Cas9 protein to said target sequence, (ii) the 3′ homology arm of the donor DNA molecule hybridizes to the non-target strand of the genomic DNA, and (iii) a nucleotide sequence of the donor DNA molecule is incorporated into the genomic DNA. In some cases, the Cas9 protein comprises a functional RuvC domain and cleaves at least the non-target strand of genomic DNA. In some cases, the Cas9 protein comprises a functional HNH domain and cleaves at least the target strand of genomic DNA. In some cases, the donor DNA molecule is single stranded. In some cases, the 5′ homology arm of the donor DNA molecule is 40 to 200 nucleotides in length (e.g., 40 nucleotides (nt) to 50 nt, 50 nt to 75 nt, 75 nt to 100 nt, 100 nt to 125 nt, 125 nt to 150 nt, 150 nt to 175 nt, or 175 nt to 200 nt in length). In some cases, the donor DNA molecule comprises a heterologous nucleotide sequence, between the 5′ and 3′ homology arms, that is incorporated into the genomic DNA.
In some cases, the donor DNA molecule comprises one or more synthetic modifications selected from: a base modification, a sugar modification, and a backbone modification. In some cases, the donor DNA molecule comprises a phosphorothioate linkage.
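
The length constraints recited above for the asymmetric donor DNA molecule can be illustrated with a minimal sketch. The class and function names below are hypothetical and are not part of the disclosure, and the shared-run check is a simplification that compares plain 5′-to-3′ strings without modeling strand orientation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DonorDesign:
    """Hypothetical container for the parts of a donor DNA molecule (5' to 3')."""
    five_prime_arm: str
    insert: str
    three_prime_arm: str


def shares_run(arm: str, target: str, min_run: int = 10) -> bool:
    """True if `arm` contains at least `min_run` consecutive nucleotides of `target`."""
    return any(
        target[i:i + min_run] in arm
        for i in range(len(target) - min_run + 1)
    )


def is_asymmetric_donor(donor: DonorDesign, target: str) -> bool:
    """Check the 3' arm length (20 to 50 nt), the arm asymmetry, and the shared run."""
    len5 = len(donor.five_prime_arm)
    len3 = len(donor.three_prime_arm)
    return 20 <= len3 <= 50 and len3 < len5 and shares_run(donor.three_prime_arm, target)


def has_preferred_five_prime_arm(donor: DonorDesign) -> bool:
    """Optional check for the 40 to 200 nt 5' arm recited for some cases."""
    return 40 <= len(donor.five_prime_arm) <= 200


# Illustrative placeholder sequences only; they carry no biological meaning.
target = "ACGTACGTACGTACGTACGT"
donor = DonorDesign("A" * 90, "GGATCC", target + "GGGGG")
print(is_asymmetric_donor(donor, target), has_preferred_five_prime_arm(donor))
```

The 5′ arm range is kept as a separate check because the text recites it only for some cases, whereas the 3′ arm limits apply to every asymmetric donor.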

The present disclosure provides a system for editing genomic DNA in a eukaryotic cell, the system comprising: (a) a Cas9 guide RNA, or one or more nucleic acids encoding said Cas9 guide RNA, wherein the Cas9 guide RNA comprises a guide sequence that is complementary to a target sequence of a target strand of genomic DNA of a eukaryotic cell; and (b) an asymmetric double stranded or single stranded donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides in length (e.g., 20 nucleotides (nt) to 25 nt, 25 nt to 30 nt, 30 nt to 35 nt, 35 nt to 40 nt, 40 nt to 45 nt, or 45 nt to 50 nt in length), is shorter than the 5′ homology arm, and comprises at least 10 consecutive nucleotides of said target sequence. In some cases, the system comprises a Cas9 protein or a nucleic acid encoding said Cas9 protein. In some cases, the Cas9 protein comprises a functional RuvC domain. In some cases, the Cas9 protein comprises a functional HNH domain. In some cases, the donor DNA molecule is single stranded. In some cases, the 5′ homology arm of the donor DNA molecule is 40 to 200 nucleotides in length (e.g., 40 nucleotides (nt) to 50 nt, 50 nt to 75 nt, 75 nt to 100 nt, 100 nt to 125 nt, 125 nt to 150 nt, 150 nt to 175 nt, or 175 nt to 200 nt in length). In some cases, the donor DNA molecule comprises a nucleotide sequence, between the 5′ and 3′ homology arms, that is heterologous to said genomic DNA. In some cases, the donor DNA molecule comprises one or more synthetic modifications selected from: a base modification, a sugar modification, and a backbone modification. In some cases, the donor DNA molecule comprises a phosphorothioate linkage.

The present disclosure provides a method of editing a target genomic DNA of a eukaryotic cell, the method comprising introducing into the eukaryotic cell: (a) a dead Cas9 (dCas9) protein, or a nucleic acid encoding said dCas9 protein, wherein the dCas9 protein does not cleave the target genomic DNA; (b) a Cas9 guide RNA, or one or more nucleic acids encoding said Cas9 guide RNA, wherein the Cas9 guide RNA hybridizes to a target sequence of the target genomic DNA; and (c) a corresponding double stranded or single stranded donor DNA template molecule comprising at least 10 consecutive nucleotides of said target sequence, wherein the dCas9 protein forms a complex with the Cas9 guide RNA thereby guiding the dCas9 protein to said target sequence, and wherein a nucleotide sequence of the donor DNA molecule is incorporated into the genomic DNA. In some cases, the donor DNA template molecule is an asymmetric donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides in length (e.g., 20 nucleotides (nt) to 25 nt, 25 nt to 30 nt, 30 nt to 35 nt, 35 nt to 40 nt, 40 nt to 45 nt, or 45 nt to 50 nt in length), is shorter than the 5′ homology arm, and comprises at least 10 consecutive nucleotides of said target sequence. In some cases, the 5′ homology arm is 40 to 200 nucleotides in length (e.g., 40 nucleotides (nt) to 50 nt, 50 nt to 75 nt, 75 nt to 100 nt, 100 nt to 125 nt, 125 nt to 150 nt, 150 nt to 175 nt, or 175 nt to 200 nt in length). In some cases, the donor DNA template molecule is single stranded. In some cases, the donor DNA template molecule comprises a heterologous nucleotide sequence that is incorporated into the genomic DNA.
In some cases, said introducing comprises introducing into the cell two or more Cas9 guide RNAs, or one or more nucleic acids encoding said two or more Cas9 guide RNAs, wherein the two or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides (e.g., by 1 nucleotide to 10 nucleotides (nt), 10 nt to 20 nt, 20 nt to 30 nt, 30 nt to 40 nt, 40 nt to 50 nt, 50 nt to 60 nt, 60 nt to 70 nt, 70 nt to 80 nt, 80 nt to 90 nt, or 90 nt to 100 nt). In some cases, the two or more Cas9 guide RNAs hybridize to target sequences that overlap with the donor DNA template molecule. In some cases, said introducing comprises introducing into the cell three or more Cas9 guide RNAs, or one or more nucleic acids encoding said three or more Cas9 guide RNAs, wherein the three or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides (e.g., by 1 nucleotide to 10 nucleotides (nt), 10 nt to 20 nt, 20 nt to 30 nt, 30 nt to 40 nt, 40 nt to 50 nt, 50 nt to 60 nt, 60 nt to 70 nt, 70 nt to 80 nt, 80 nt to 90 nt, or 90 nt to 100 nt). In some cases, the three or more Cas9 guide RNAs hybridize to target sequences that overlap with the donor DNA template molecule. In some cases, said introducing comprises introducing into the cell four or more Cas9 guide RNAs, or one or more nucleic acids encoding said four or more Cas9 guide RNAs, wherein the four or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides (e.g., by 1 nucleotide to 10 nucleotides (nt), 10 nt to 20 nt, 20 nt to 30 nt, 30 nt to 40 nt, 40 nt to 50 nt, 50 nt to 60 nt, 60 nt to 70 nt, 70 nt to 80 nt, 80 nt to 90 nt, or 90 nt to 100 nt). In some cases, the four or more Cas9 guide RNAs hybridize to target sequences that overlap with the donor DNA template molecule.
In some cases, the donor DNA molecule comprises one or more synthetic modifications selected from: a base modification, a sugar modification, and a backbone modification. In some cases, the donor DNA molecule comprises a phosphorothioate linkage.
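
The spacing condition recited above for two or more Cas9 guide RNAs (non-overlapping target sequences separated by 1-100 nucleotides) can be sketched as a simple interval check. This is an illustration only: the function name and the half-open `(start, end)` coordinate convention are assumptions, and the sketch reads "separated from one another" as applying to neighboring target sequences along the genomic DNA.

```python
from typing import List, Tuple

Site = Tuple[int, int]  # half-open interval [start, end) on one genomic coordinate axis


def guides_properly_spaced(sites: List[Site], min_gap: int = 1, max_gap: int = 100) -> bool:
    """True if neighboring target sites do not overlap and lie 1-100 nt apart."""
    if len(sites) < 2:
        return False  # the condition is recited for two or more guide RNAs
    ordered = sorted(sites)
    for (_, end_a), (start_b, _) in zip(ordered, ordered[1:]):
        gap = start_b - end_a  # negative when the sites overlap, 0 when they abut
        if gap < min_gap or gap > max_gap:
            return False
    return True


# Three hypothetical 20 nt target sites with gaps of 10 nt and 70 nt.
print(guides_properly_spaced([(0, 20), (30, 50), (120, 140)]))
```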

The present disclosure provides a system for editing genomic DNA in a eukaryotic cell, the system comprising: (a) a dead Cas9 (dCas9) protein, or a nucleic acid encoding said dCas9 protein, wherein the dCas9 protein lacks catalytically active RuvC and HNH domains; (b) a Cas9 guide RNA, or one or more nucleic acids encoding said Cas9 guide RNA, wherein the Cas9 guide RNA comprises a guide sequence that is complementary to a target sequence of a target genomic DNA of a eukaryotic cell; and (c) a corresponding double stranded or single stranded donor DNA template molecule comprising at least 10 consecutive nucleotides of said target sequence. In some cases, the donor DNA template molecule is an asymmetric donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides in length (e.g., 20 nucleotides (nt) to 25 nt, 25 nt to 30 nt, 30 nt to 35 nt, 35 nt to 40 nt, 40 nt to 45 nt, or 45 nt to 50 nt in length), is shorter than the 5′ homology arm, and comprises the at least 10 consecutive nucleotides of said target sequence. In some cases, the 5′ homology arm of the donor DNA molecule is 40 to 200 nucleotides in length (e.g., 40 nucleotides (nt) to 50 nt, 50 nt to 75 nt, 75 nt to 100 nt, 100 nt to 125 nt, 125 nt to 150 nt, 150 nt to 175 nt, or 175 nt to 200 nt in length). In some cases, the donor DNA template molecule is single stranded. In some cases, the donor DNA template molecule comprises a nucleotide sequence that is heterologous to said target genomic DNA.
In some cases, the system comprises two or more Cas9 guide RNAs, or one or more nucleic acids encoding said two or more Cas9 guide RNAs, wherein the guide sequences of the two or more Cas9 guide RNAs are complementary to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides (e.g., by 1 nucleotide to 10 nucleotides (nt), 10 nt to 20 nt, 20 nt to 30 nt, 30 nt to 40 nt, 40 nt to 50 nt, 50 nt to 60 nt, 60 nt to 70 nt, 70 nt to 80 nt, 80 nt to 90 nt, or 90 nt to 100 nt). In some cases, the two or more Cas9 guide RNAs are complementary to target sequences that overlap with the donor DNA template molecule. In some cases, the system comprises three or more Cas9 guide RNAs, or one or more nucleic acids encoding said three or more Cas9 guide RNAs, wherein the guide sequences of the three or more Cas9 guide RNAs are complementary to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides (e.g., by 1 nucleotide to 10 nucleotides (nt), 10 nt to 20 nt, 20 nt to 30 nt, 30 nt to 40 nt, 40 nt to 50 nt, 50 nt to 60 nt, 60 nt to 70 nt, 70 nt to 80 nt, 80 nt to 90 nt, or 90 nt to 100 nt). In some cases, the three or more Cas9 guide RNAs are complementary to target sequences that overlap with the donor DNA template molecule. In some cases, the system comprises four or more Cas9 guide RNAs, or one or more nucleic acids encoding said four or more Cas9 guide RNAs, wherein the guide sequences of the four or more Cas9 guide RNAs are complementary to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides (e.g., by 1 nucleotide to 10 nucleotides (nt), 10 nt to 20 nt, 20 nt to 30 nt, 30 nt to 40 nt, 40 nt to 50 nt, 50 nt to 60 nt, 60 nt to 70 nt, 70 nt to 80 nt, 80 nt to 90 nt, or 90 nt to 100 nt). In some cases, the four or more Cas9 guide RNAs are complementary to target sequences that overlap with the donor DNA template molecule.
In some cases, the donor DNA molecule comprises one or more synthetic modifications selected from: a base modification, a sugar modification, and a backbone modification. In some cases, the donor DNA molecule comprises a phosphorothioate linkage. In some cases, the system comprises a eukaryotic cell comprising said target genomic DNA.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A-1B depicts association and dissociation of Cas9 from substrate DNA.

FIG. 2A-2E depicts data showing that complementary DNA anneals to the non-target strand of the RNP-dsDNA complex.

FIG. 3A-3E depicts that delivery of ssDNA donors complementary to the non-target strand drives efficient HDR using Cas9, nickases and dCas9.

FIG. 4 depicts a schematic of Cas9's interaction with substrate DNA.

FIG. 5A-5E depicts association and dissociation of Cas9 and Cas9 variants from substrate DNA.

FIG. 6 depicts a schematic of reagents and experimental design for various EMSA experiments.

FIG. 7A-7C depicts removal of non-target strand by challenge DNA.

FIG. 8 depicts a model for challenge-mediated non-target strand removal activity.

FIG. 9 depicts that the non-target strand is available for enzymatic modification in cells.

FIG. 10A-10B depicts flow cytometry data corresponding to data in FIG. 3.

FIG. 11 depicts the strand-bias for optimized donor DNA.

FIG. 12A-12B depicts various conditions affecting the cutting of loci.

FIG. 13 depicts structural data that is consistent with the asymmetric release of substrate by Cas9.

FIG. 14A-14B depict schematically a target nucleic acid, a guide RNA, and a Cas9 protein (FIG. 14A) and binding of an asymmetric donor DNA to the non-target strand of a target nucleic acid (FIG. 14B).

FIG. 15A-15B depict schematically the interaction of a Cas9 polypeptide with reduced enzymatic activity (“dead Cas9” or “dCas9”) with a guide RNA and a target nucleic acid.

FIG. 16A-16C schematically depict two different types of Cas9 guide RNAs. (FIG. 16A) Dual guide RNA associated with a Cas9 protein and with a target nucleic acid. (FIG. 16B) Single guide RNA associated with a Cas9 protein and with a target nucleic acid. (FIG. 16C) Schematic of one possible single guide RNA. The depicted guide RNA is a single guide RNA with a targeter covalently linked to an activator via 4 linker nucleotides. The nucleotides are 5′ to 3′ from left to right. SEQ ID NO: 1089.

FIG. 17 presents the sequences of the sgRNA templates, DNA substrates, and donor DNAs used in the examples. Top to bottom: SEQ ID NOs: 1090-1133.

FIG. 18 presents the sequences of the expressed Cas9 clones used in the examples. Top to bottom: SEQ ID NOs: 1134-1140.

DEFINITIONS

The terms “polynucleotide” and “nucleic acid,” used interchangeably herein, refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, this term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.

By “hybridizable” or “complementary” or “substantially complementary” it is meant that a nucleic acid (e.g. RNA, DNA) comprises a sequence of nucleotides that enables it to non-covalently bind, i.e. form Watson-Crick base pairs and/or G/U base pairs, “anneal”, or “hybridize,” to another nucleic acid in a sequence-specific, antiparallel, manner (i.e., a nucleic acid specifically binds to a complementary nucleic acid) under the appropriate in vitro and/or in vivo conditions of temperature and solution ionic strength. Standard Watson-Crick base-pairing includes: adenine (A) pairing with thymine (T), adenine (A) pairing with uracil (U), and guanine (G) pairing with cytosine (C) [DNA, RNA]. In addition, for hybridization between two RNA molecules (e.g., dsRNA), and for hybridization of a DNA molecule with an RNA molecule (e.g., when a DNA target nucleic acid base pairs with an RNA guide nucleic acid, etc.): guanine (G) can also base pair with uracil (U). For example, G/U base-pairing is partially responsible for the degeneracy (i.e., redundancy) of the genetic code in the context of tRNA anti-codon base-pairing with codons in mRNA. Thus, in the context of this disclosure, a guanine (G) (e.g., of a protein-binding segment (dsRNA duplex) of a subject guide nucleic acid molecule; of a target nucleic acid base pairing with a guide nucleic acid, etc.) is considered complementary to both a uracil (U) and to an adenine (A). For example, when a G/U base-pair can be made at a given nucleotide position of a protein-binding segment (e.g., dsRNA duplex) of a subject guide nucleic acid molecule, the position is not considered to be non-complementary, but is instead considered to be complementary.
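
The pairing rules described above can be expressed as a small lookup, shown here as a minimal sketch. The names are hypothetical; the sketch encodes only the Watson-Crick pairs and the G/U pair named in this paragraph, requires full-length antiparallel pairing, and ignores temperature and ionic strength.

```python
# Watson-Crick pairs named in the text, listed in both orientations.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
# G/U pairing, which the text allows for RNA-RNA and DNA-RNA hybridization.
GU_PAIRS = {("G", "U"), ("U", "G")}


def is_fully_complementary(strand: str, other: str, allow_gu: bool = True) -> bool:
    """True if two 5'-to-3' strands of equal length pair at every position when antiparallel."""
    if len(strand) != len(other):
        return False
    allowed = WATSON_CRICK | GU_PAIRS if allow_gu else WATSON_CRICK
    # Reversing `other` aligns its 3' end with the 5' end of `strand`.
    return all((a, b) in allowed for a, b in zip(strand, reversed(other)))


print(is_fully_complementary("GGAU", "AUCC"))                  # Watson-Crick pairs only
print(is_fully_complementary("GGAU", "AUUC"))                  # one position relies on G/U
print(is_fully_complementary("GGAU", "AUUC", allow_gu=False))  # same duplex, G/U disallowed
```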

Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein; and Sambrook, J. and Russell, W., Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (2001). The conditions of temperature and ionic strength determine the “stringency” of the hybridization.

Hybridization requires that the two nucleic acids contain complementary sequences, although mismatches between bases are possible. The conditions appropriate for hybridization between two nucleic acids depend on the length of the nucleic acids and the degree of complementarity, variables well known in the art. The greater the degree of complementarity between two nucleotide sequences, the greater the value of the melting temperature (Tm) for hybrids of nucleic acids having those sequences. For hybridizations between nucleic acids with short stretches of complementarity (e.g. complementarity over 35 or less, 30 or less, 25 or less, 22 or less, 20 or less, or 18 or less nucleotides) the position of mismatches can become important (see Sambrook et al., supra, 11.7-11.8). Typically, the length for a hybridizable nucleic acid is 8 nucleotides or more (e.g., 10 nucleotides or more, 12 nucleotides or more, 15 nucleotides or more, 20 nucleotides or more, 22 nucleotides or more, 25 nucleotides or more, or 30 nucleotides or more). The temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the region of complementation and the degree of complementation.

It is understood that the sequence of a polynucleotide need not be 100% complementary to that of its target nucleic acid to be specifically hybridizable or hybridizable. Moreover, a polynucleotide may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a loop structure or hairpin structure). A polynucleotide can comprise 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, 99% or more, 99.5% or more, or 100% sequence complementarity to a target region within the target nucleic acid sequence to which it will hybridize. For example, an antisense nucleic acid in which 18 of 20 nucleotides of the antisense compound are complementary to a target region, and would therefore specifically hybridize, would represent 90 percent complementarity. In this example, the remaining noncomplementary nucleotides may be clustered or interspersed with complementary nucleotides and need not be contiguous to each other or to complementary nucleotides. Percent complementarity between particular stretches of nucleic acid sequences within nucleic acids can be determined using any convenient method. Exemplary methods include BLAST programs (basic local alignment search tools) and PowerBLAST programs (Altschul et al., J. Mol. Biol., 1990, 215, 403-410; Zhang and Madden, Genome Res., 1997, 7, 649-656) or by using the Gap program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research Park, Madison Wis.), using default settings, which uses the algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489).
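
The 18-of-20 example above can be reproduced with a short calculation. The following is a sketch only, not the BLAST or Gap method cited in the text: the function name is hypothetical, the two strands are assumed to be of equal length and already aligned without gaps, and only Watson-Crick pairs are counted.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(oligo: str, target_region: str) -> float:
    """Percent of positions that pair when two equal-length 5'-to-3' strands are antiparallel."""
    if not oligo or len(oligo) != len(target_region):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((a, b) in WATSON_CRICK for a, b in zip(oligo, reversed(target_region)))
    return 100.0 * paired / len(oligo)


# Hypothetical 20 nt target region and an oligo with 2 mismatched positions.
target_region = "ACGTACGTACGTACGTACGT"
complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
perfect = "".join(complement[base] for base in reversed(target_region))
mismatched = "T" + perfect[1:5] + "G" + perfect[6:]
print(percent_complementarity(mismatched, target_region))  # 18 of 20 positions pair: 90.0
```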

The terms “peptide,” “polypeptide,” and “protein” are used interchangeably herein, and refer to a polymeric form of amino acids of any length, which can include coded and non-coded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having modified peptide backbones.

“Binding” as used herein (e.g. with reference to an RNA-binding domain of a polypeptide, binding to a target nucleic acid, and the like) refers to a non-covalent interaction between macromolecules (e.g., between a protein and a nucleic acid; between a subject Cas9/guide nucleic acid complex and a target nucleic acid; and the like). While in a state of non-covalent interaction, the macromolecules are said to be “associated” or “interacting” or “binding” (e.g., when a molecule X is said to interact with a molecule Y, it is meant the molecule X binds to molecule Y in a non-covalent manner). Not all components of a binding interaction need be sequence-specific (e.g., contacts with phosphate residues in a DNA backbone), but some portions of a binding interaction may be sequence-specific. Binding interactions are generally characterized by a dissociation constant (K_d) of less than 10⁻⁶ M, less than 10⁻⁷ M, less than 10⁻⁸ M, less than 10⁻⁹ M, less than 10⁻¹⁰ M, less than 10⁻¹¹ M, less than 10⁻¹² M, less than 10⁻¹³ M, less than 10⁻¹⁴ M, or less than 10⁻¹⁵ M. “Affinity” refers to the strength of binding, increased binding affinity being correlated with a lower K_d.
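
The inverse relationship between dissociation constant and affinity can be illustrated numerically. The sketch below assumes a simple 1:1 equilibrium with the binding partner in excess, under which the fraction bound equals [partner] / ([partner] + K_d); this model and the function name are assumptions introduced for illustration and are not recited in the disclosure.

```python
def fraction_bound(free_partner_molar: float, kd_molar: float) -> float:
    """Fraction of a molecule bound at equilibrium for simple 1:1 binding (assumed model)."""
    if free_partner_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and K_d must be > 0")
    return free_partner_molar / (free_partner_molar + kd_molar)


# At a fixed 10 nM partner concentration, a lower K_d gives a higher fraction bound.
for kd in (1e-6, 1e-9, 1e-12):
    print(f"K_d = {kd:.0e} M -> fraction bound = {fraction_bound(1e-8, kd):.4f}")
```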

By “binding domain” it is meant a protein domain that is able to bind non-covalently to another molecule. A binding domain can bind to, for example, a DNA molecule (a DNA-binding domain), an RNA molecule (an RNA-binding domain) and/or a protein molecule (a protein-binding domain). In the case of a protein having a protein-binding domain, it can in some cases bind to itself (to form homodimers, homotrimers, etc.) and/or it can bind to one or more regions of a different protein or proteins.

The term “conservative amino acid substitution” refers to the interchangeability in proteins of amino acid residues having similar side chains. For example, a group of amino acids having aliphatic side chains consists of glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains consists of serine and threonine; a group of amino acids having amide containing side chains consists of asparagine and glutamine; a group of amino acids having aromatic side chains consists of phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains consists of lysine, arginine, and histidine; a group of amino acids having acidic side chains consists of glutamate and aspartate; and a group of amino acids having sulfur containing side chains consists of cysteine and methionine. Exemplary conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine-glycine, and asparagine-glutamine.
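
The side-chain groups listed above can be encoded as a lookup table, as in the following minimal sketch. The one-letter codes are the standard amino acid abbreviations; the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Side-chain groups as listed in the text, using standard one-letter codes.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),        # glycine, alanine, valine, leucine, isoleucine
    "aliphatic-hydroxyl": set("ST"),  # serine, threonine
    "amide": set("NQ"),               # asparagine, glutamine
    "aromatic": set("FYW"),           # phenylalanine, tyrosine, tryptophan
    "basic": set("KRH"),              # lysine, arginine, histidine
    "acidic": set("ED"),              # glutamate, aspartate
    "sulfur": set("CM"),              # cysteine, methionine
}


def share_side_chain_group(residue_a: str, residue_b: str) -> bool:
    """True if both residues fall in the same side-chain group listed above."""
    return any(
        residue_a in members and residue_b in members
        for members in SIDE_CHAIN_GROUPS.values()
    )


print(share_side_chain_group("L", "I"))  # both aliphatic
print(share_side_chain_group("K", "D"))  # basic versus acidic
```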

A polynucleotide or polypeptide has a certain percent “sequence identity” to another polynucleotide or polypeptide, meaning that, when aligned, that percentage of bases or amino acids are the same, and in the same relative position, when comparing the two sequences. Sequence identity can be determined in a number of different ways. To determine sequence identity, sequences can be aligned using various methods and computer programs (e.g., BLAST, T-COFFEE, MUSCLE, MAFFT, etc.), available over the world wide web at sites including ncbi.nlm.nih.gov/BLAST, ebi.ac.uk/Tools/msa/tcoffee/, ebi.ac.uk/Tools/msa/muscle/, mafft.cbrc.jp/alignment/software/. See, e.g., Altschul et al. (1990), J. Mol. Biol. 215:403-10.
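
Once two sequences have been aligned by one of the programs named above, percent identity can be counted column by column. The sketch below is illustrative only: the function name is hypothetical, the inputs are assumed to be pre-aligned strings that use "-" for gaps, and the denominator used here (all alignment columns that are not gap-only) is one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns where both sequences have a gap; they carry no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    identical = sum(a == b for a, b in columns)
    return 100.0 * identical / len(columns)


# Hypothetical pairwise alignment with one gapped column out of six.
print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
```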

A DNA sequence that “encodes” a particular RNA is a DNA nucleic acid sequence that is transcribed into RNA. A DNA polynucleotide may encode an RNA (mRNA) that is translated into protein, or a DNA polynucleotide may encode an RNA that is not translated into protein (e.g. tRNA, rRNA, microRNA (miRNA), a “non-coding” RNA (ncRNA), a guide nucleic acid, etc.).

A “protein coding sequence” or a sequence that encodes a particular protein or polypeptide, is a nucleic acid sequence that is transcribed into mRNA (in the case of DNA) and is translated (in the case of mRNA) into a polypeptide in vitro or in vivo when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a start codon at the 5′ terminus (N-terminus) and a translation stop nonsense codon at the 3′ terminus (C-terminus). A coding sequence can include, but is not limited to, cDNA from prokaryotic or eukaryotic mRNA, genomic DNA sequences from prokaryotic or eukaryotic DNA, and synthetic nucleic acids. A transcription termination sequence will usually be located 3′ to the coding sequence.

The terms “DNA regulatory sequences,” “control elements,” and “regulatory elements,” used interchangeably herein, refer to transcriptional and translational control sequences, such as promoters, enhancers, polyadenylation signals, terminators, protein degradation signals, and the like, that provide for and/or regulate transcription of a non-coding sequence (e.g., guide nucleic acid) or a coding sequence (e.g., Cas9 polypeptide) and/or regulate translation of an encoded polypeptide.

As used herein, a “promoter sequence” is a DNA regulatory region capable of binding RNA polymerase and initiating transcription of a downstream (3′ direction) coding or non-coding sequence. For purposes of the present disclosure, the promoter sequence is bounded at its 3′ terminus by the transcription initiation site and extends upstream (5′ direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation site, as well as protein binding domains responsible for the binding of RNA polymerase. Eukaryotic promoters will often, but not always, contain “TATA” boxes and “CAT” boxes. Various promoters, including inducible promoters, may be used to drive the various vectors of the present disclosure.

The term “naturally-occurring” or “unmodified” or “wild type” as used herein as applied to a nucleic acid, a polypeptide, a cell, or an organism, refers to a nucleic acid, polypeptide, cell, or organism that is found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by a human in the laboratory is wild type (and naturally occurring).

The term “chimeric” as used herein as applied to a nucleic acid or polypeptide refers to two components that are defined by structures derived from different sources. For example, where “chimeric” is used in the context of a chimeric polypeptide (e.g., a chimeric Cas9 protein), the chimeric polypeptide includes amino acid sequences that are derived from different polypeptides. A chimeric polypeptide may comprise either modified or naturally-occurring polypeptide sequences (e.g., a first amino acid sequence from a modified or unmodified Cas9 protein; and a second amino acid sequence other than the Cas9 protein). Similarly, “chimeric” in the context of a polynucleotide encoding a chimeric polypeptide includes nucleotide sequences derived from different coding regions (e.g., a first nucleotide sequence encoding a modified or unmodified Cas9 protein; and a second nucleotide sequence encoding a polypeptide other than a Cas9 protein).

The term “chimeric polypeptide” refers to a polypeptide which is made by the combination (i.e., “fusion”) of two otherwise separated segments of amino acid sequence, usually through human intervention. A polypeptide that comprises a chimeric amino acid sequence is a chimeric polypeptide. Some chimeric polypeptides can be referred to as “fusion variants.”

“Heterologous,” as used herein, means a nucleotide or polypeptide sequence that is not found in the native nucleic acid or protein, respectively. For example, in a chimeric Cas9 protein, the RNA-binding domain of a naturally-occurring bacterial Cas9 polypeptide (or a variant thereof) may be fused to a heterologous polypeptide sequence (i.e. a polypeptide sequence from a protein other than Cas9 or a polypeptide sequence from another organism). The heterologous polypeptide sequence may exhibit an activity (e.g., enzymatic activity) that will also be exhibited by the chimeric Cas9 protein (e.g., methyltransferase activity, acetyltransferase activity, kinase activity, ubiquitinating activity, etc.). A heterologous nucleic acid sequence may be linked to a naturally-occurring nucleic acid sequence (or a variant thereof) (e.g., by genetic engineering) to generate a chimeric nucleotide sequence encoding a chimeric polypeptide. As another example, in a fusion variant Cas9 polypeptide, a variant Cas9 polypeptide may be fused to a heterologous polypeptide (i.e. a polypeptide other than Cas9), which exhibits an activity that will also be exhibited by the fusion variant Cas9 polypeptide. A heterologous nucleic acid sequence may be linked to a variant Cas9 polypeptide (e.g., by genetic engineering) to generate a nucleotide sequence encoding a fusion variant polypeptide.

“Recombinant,” as used herein, means that a particular nucleic acid (DNA or RNA) is the product of various combinations of cloning, restriction, polymerase chain reaction (PCR) and/or ligation steps resulting in a construct having a structural coding or non-coding sequence distinguishable from endogenous nucleic acids found in natural systems. DNA sequences encoding polypeptides can be assembled from cDNA fragments or from a series of synthetic oligonucleotides, to provide a synthetic nucleic acid which is capable of being expressed from a recombinant transcriptional unit contained in a cell or in a cell-free transcription and translation system. Genomic DNA comprising the relevant sequences can also be used in the formation of a recombinant gene or transcriptional unit. Sequences of non-translated DNA may be present 5′ or 3′ from the open reading frame, where such sequences do not interfere with manipulation or expression of the coding regions, and may indeed act to modulate production of a desired product by various mechanisms (see “DNA regulatory sequences”, above). Alternatively, DNA sequences encoding RNA (e.g., guide nucleic acid) that is not translated may also be considered recombinant. Thus, e.g., the term “recombinant” nucleic acid refers to one which is not naturally occurring, e.g., is made by the artificial combination of two otherwise separated segments of sequence through human intervention. This artificial combination is often accomplished by either chemical synthesis means, or by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques. Such is usually done to replace a codon with a codon encoding the same amino acid, a conservative amino acid, or a non-conservative amino acid. Alternatively, it is performed to join together nucleic acid segments of desired functions to generate a desired combination of functions.
When a recombinant polynucleotide encodes a polypeptide, the sequence of the encoded polypeptide can be naturally occurring (“wild type”) or can be a variant (e.g., a mutant) of the naturally occurring sequence. Thus, the term “recombinant” polypeptide does not necessarily refer to a polypeptide whose sequence does not naturally occur. Instead, a “recombinant” polypeptide is encoded by a recombinant DNA sequence, but the sequence of the polypeptide can be naturally occurring (“wild type”) or non-naturally occurring (e.g., a variant, a mutant, etc.). Thus, a “recombinant” polypeptide is the result of human intervention, but may be a naturally occurring amino acid sequence.

A “vector” or “expression vector” is a replicon, such as a plasmid, phage, virus, or cosmid, to which another DNA segment, i.e., an “insert”, may be attached so as to bring about the replication of the attached segment in a cell.

An “expression cassette” comprises a DNA coding sequence operably linked to a promoter. “Operably linked” refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. For instance, a promoter is operably linked to a coding sequence if the promoter affects its transcription or expression.

The terms “recombinant expression vector” and “DNA construct” are used interchangeably herein to refer to a DNA molecule comprising a vector and at least one insert. Recombinant expression vectors are usually generated for the purpose of expressing and/or propagating the insert(s), or for the construction of other recombinant nucleotide sequences. The insert(s) may or may not be operably linked to a promoter sequence and may or may not be operably linked to DNA regulatory sequences.

A cell has been “genetically modified” or “transformed” or “transfected”by exogenous DNA, e.g. a recombinant expression vector, when such DNAhas been introduced inside the cell. The presence of the exogenous DNAresults in permanent or transient genetic change. The transforming DNAmay or may not be integrated (covalently linked) into the genome of thecell. In prokaryotes, yeast, and mammalian cells for example, thetransforming DNA may be maintained on an episomal element such as aplasmid. With respect to eukaryotic cells, a stably transformed cell isone in which the transforming DNA has become integrated into achromosome so that it is inherited by daughter cells through chromosomereplication. This stability is demonstrated by the ability of theeukaryotic cell to establish cell lines or clones that comprise apopulation of daughter cells containing the transforming DNA. A “clone”is a population of cells derived from a single cell or common ancestorby mitosis. A “cell line” is a clone of a primary cell that is capableof stable growth in vitro for many generations.

Suitable methods of genetic modification (also referred to as “transformation”) include, e.g., viral or bacteriophage infection, transfection, conjugation, protoplast fusion, lipofection, electroporation, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran mediated transfection, liposome-mediated transfection, particle gun technology, direct microinjection, nanoparticle-mediated nucleic acid delivery (see, e.g., Panyam et al., Adv Drug Deliv Rev. 2012 Sep. 13. pii: S0169-409X(12)00283-9. doi: 10.1016/j.addr.2012.09.023), and the like.

The choice of method of genetic modification is generally dependent on the type of cell being transformed and the circumstances under which the transformation is taking place (e.g., in vitro, ex vivo, or in vivo). A general discussion of these methods can be found in Ausubel, et al., Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons, 1995.

A “target nucleic acid” as used herein is a polynucleotide (e.g., DNA such as genomic DNA) that includes a “target site” or “target sequence.” The terms “target site” or “target sequence” are used interchangeably herein to refer to a nucleic acid sequence present in a target nucleic acid to which a targeting segment of a subject guide nucleic acid (e.g., Cas9 guide RNA) will bind (e.g., see FIG. 14A-14B, FIG. 15A-15B, FIG. 16A-16C), provided sufficient conditions for binding exist. For example, the target site (or target sequence) 5′-GAGCAUAUC-3′ within a target nucleic acid is targeted by (or is bound by, or hybridizes with, or is complementary to) the sequence 5′-GAUAUGCUC-3′. Suitable hybridization conditions include physiological conditions normally present in a cell. For a double stranded target nucleic acid, the strand of the target nucleic acid that is complementary to and hybridizes with the guide nucleic acid is referred to as the “complementary strand”; while the strand of the target nucleic acid that is complementary to the “complementary strand” (and is therefore not complementary to the guide nucleic acid) is referred to as the “noncomplementary strand” or “non-complementary strand”. In cases where the target nucleic acid is a single stranded target nucleic acid (e.g., single stranded DNA (ssDNA), single stranded RNA (ssRNA)), the guide nucleic acid is complementary to and hybridizes with the single stranded target nucleic acid.
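The complementarity in the example above can be checked with a short sketch. The following Python code is illustrative only: the function names are hypothetical, and it assumes strict Watson-Crick pairing (A-U, G-C) between two RNA-alphabet sequences written 5′ to 3′, ignoring wobble pairs and the hybridization conditions discussed above.

```python
# Watson-Crick pairing for RNA-alphabet sequences (assumption: no wobble pairs).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of a sequence written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def is_complementary(target_site: str, targeting_sequence: str) -> bool:
    """True if the two 5'->3' sequences pair antiparallel along their full length."""
    return reverse_complement_rna(target_site) == targeting_sequence.upper()


# The example sequences from the definition of "target site".
target_site = "GAGCAUAUC"
targeting_sequence = "GAUAUGCUC"
print(reverse_complement_rna(target_site))                # GAUAUGCUC
print(is_complementary(target_site, targeting_sequence))  # True
```

Because hybridization is antiparallel, the targeting sequence reads as the reverse complement of the target site when both are written 5′ to 3′.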

By “Cas9 polypeptide” or “Cas9 protein” or “site-directed polypeptide” or “site-directed Cas9 polypeptide” it is meant a polypeptide that binds RNA (e.g., the protein binding segment of a guide nucleic acid) and is targeted to a specific sequence (a target site) in a target nucleic acid. A Cas9 polypeptide as described herein is targeted to a target site by the guide nucleic acid (e.g., Cas9 guide RNA) to which it is bound. The guide nucleic acid comprises a sequence that is complementary to a target sequence within the target nucleic acid, thus targeting the bound Cas9 polypeptide to a specific location within the target nucleic acid (the target sequence) (e.g., stabilizing the interaction of Cas9 with the target nucleic acid). In some cases, the Cas9 polypeptide is a naturally-occurring polypeptide (e.g., naturally occurs in bacterial and/or archaeal cells). In other cases, the Cas9 polypeptide is not a naturally-occurring polypeptide (e.g., the Cas9 polypeptide is a variant Cas9 polypeptide, a chimeric polypeptide as discussed below, and the like). Exemplary Cas9 polypeptides are set forth in SEQ ID NOs: 5-816 as a non-limiting and non-exhaustive list. Naturally occurring Cas9 polypeptides bind a guide nucleic acid, are thereby directed to a specific sequence within a target nucleic acid (a target site), and cleave the target nucleic acid (e.g., cleave dsDNA to generate a double strand break, cleave ssDNA, cleave ssRNA, etc.). A subject Cas9 polypeptide comprises two portions, an RNA-binding portion and an activity portion. An RNA-binding portion interacts with a subject guide nucleic acid. 
An activity portion exhibits site-directed enzymaticactivity (e.g., nuclease activity, activity for DNA and/or RNAmethylation, activity for DNA and/or RNA cleavage, activity for histoneacetylation, activity for histone methylation, activity for RNAmodification, activity for RNA-binding, activity for RNA splicing etc.).In some cases, the activity portion exhibits reduced nuclease activityrelative to the corresponding portion of a wild type Cas9 polypeptide.In some cases, the activity portion is enzymatically inactive.

By “cleavage” it is meant the breakage of the covalent backbone of a target nucleic acid molecule (e.g., RNA, DNA). Cleavage can be initiated by a variety of methods including, but not limited to, enzymatic or chemical hydrolysis of a phosphodiester bond. Both single-stranded cleavage and double-stranded cleavage are possible, and double-stranded cleavage can occur as a result of two distinct single-stranded cleavage events. In certain embodiments, a complex comprising a guide nucleic acid and a Cas9 polypeptide is used for targeted cleavage of a single stranded target nucleic acid (e.g., ssRNA, ssDNA).

“Nuclease” and “endonuclease” are used interchangeably herein to mean an enzyme which possesses catalytic activity for nucleic acid cleavage (e.g., ribonuclease activity (ribonucleic acid cleavage), deoxyribonuclease activity (deoxyribonucleic acid cleavage), etc.).

By “cleavage domain” or “active domain” or “nuclease domain” of a nuclease it is meant the polypeptide sequence or domain within the nuclease which possesses the catalytic activity for nucleic acid cleavage. A cleavage domain can be contained in a single polypeptide chain or cleavage activity can result from the association of two (or more) polypeptides. A single nuclease domain may consist of more than one isolated stretch of amino acids within a given polypeptide.

A nucleic acid molecule that binds to the Cas9 polypeptide (Cas9protein) and targets the polypeptide to a specific location within thetarget nucleic acid is referred to herein as a “guide nucleic acid”.When the guide nucleic acid is an RNA molecule, it can be referred to asa “guide RNA” or a “gRNA” (e.g., a “Cas9 guide RNA”). A subject guidenucleic acid comprises two segments, a first segment (referred to hereinas a “targeting segment”); and a second segment (referred to herein as a“protein-binding segment”). By “segment” it is meant asegment/section/region of a molecule, e.g., a contiguous stretch ofnucleotides in a nucleic acid molecule. A segment can also mean aregion/section of a complex such that a segment may comprise regions ofmore than one molecule. For example, in some cases the protein-bindingsegment (described below) of a guide nucleic acid is one nucleic acidmolecule (e.g., one RNA molecule) and the protein-binding segmenttherefore comprises a region of that one molecule. In other cases, theprotein-binding segment (described below) of a guide nucleic acidcomprises two separate molecules that are hybridized along a region ofcomplementarity. As an illustrative, non-limiting example, aprotein-binding segment of a guide nucleic acid that comprises twoseparate molecules can comprise (i) base pairs 40-75 of a first molecule(e.g., RNA molecule, DNA/RNA hybrid molecule) that is 100 base pairs inlength; and (ii) base pairs 10-25 of a second molecule (e.g., RNAmolecule) that is 50 base pairs in length. The definition of “segment,”unless otherwise specifically defined in a particular context, is notlimited to a specific number of total base pairs, is not limited to anyparticular number of base pairs from a given nucleic acid molecule, isnot limited to a particular number of separate molecules within acomplex, and may include regions of nucleic acid molecules that are ofany total length and may or may not include regions with complementarityto other molecules.

The first segment (targeting segment) of a guide nucleic acid comprises a nucleotide sequence (a targeting sequence, also referred to as a “guide sequence”) that is complementary to a specific sequence (a target site) within a target nucleic acid (e.g., a target ssRNA, a target ssDNA, the complementary strand of a double stranded target DNA, etc.). The second segment is the protein-binding segment (or “protein-binding sequence”), which interacts with (binds to) a Cas9 polypeptide. Site-specific binding and/or cleavage of the target nucleic acid can occur at locations determined by base-pairing complementarity between the guide nucleic acid and the target nucleic acid. The protein-binding segment of a subject guide nucleic acid comprises two complementary stretches of nucleotides that hybridize to one another to form a double stranded RNA duplex (dsRNA duplex).
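As a rough illustration of how base-pairing complementarity determines the location of binding, the following Python sketch scans one strand of a DNA target for sites that are fully complementary to an RNA targeting sequence. It is a minimal sketch under stated assumptions: the function names and the example strand are hypothetical, only perfect Watson-Crick matches are reported, and no other requirement for Cas9 binding or cleavage is modeled.

```python
from __future__ import annotations

# RNA base -> DNA base that pairs with it (assumption: strict Watson-Crick pairing).
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def dna_site_for_guide(targeting_sequence: str) -> str:
    """DNA sequence (5'->3') that pairs antiparallel with an RNA targeting sequence."""
    return "".join(RNA_TO_DNA_PARTNER[b] for b in reversed(targeting_sequence.upper()))


def find_target_sites(strand: str, targeting_sequence: str) -> list[int]:
    """Return 0-based start positions of fully complementary sites on one DNA strand."""
    site = dna_site_for_guide(targeting_sequence)
    strand = strand.upper()
    positions = []
    start = strand.find(site)
    while start != -1:
        positions.append(start)
        start = strand.find(site, start + 1)  # continue past the previous hit
    return positions


# Hypothetical strand containing two copies of the site GAGCATATC.
strand = "TTGAGCATATCAAGAGCATATC"
print(dna_site_for_guide("GAUAUGCUC"))         # GAGCATATC
print(find_target_sites(strand, "GAUAUGCUC"))  # [2, 13]
```

For a double stranded target, the strand searched here plays the role of the complementary strand; the other strand would be searched separately.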

In some embodiments, a subject nucleic acid (e.g., a guide nucleic acid; a nucleic acid comprising a nucleotide sequence encoding a guide nucleic acid (guide RNA); a nucleic acid encoding a Cas9 polypeptide; etc.) comprises a modification or sequence (e.g., an additional segment at the 5′ and/or 3′ end) that provides for an additional desirable feature (e.g., modified or regulated stability; subcellular targeting; tracking, e.g., a fluorescent label; a binding site for a protein or protein complex; etc.). Non-limiting examples include: a 5′ cap (e.g., a 7-methylguanylate cap (m7G)); a 3′ polyadenylated tail (i.e., a 3′ poly(A) tail); a ribozyme sequence (e.g., to allow for self-cleavage and release of a mature molecule in a regulated fashion); a riboswitch sequence (e.g., to allow for regulated stability and/or regulated accessibility by proteins and/or protein complexes); a stability control sequence; a sequence that forms a dsRNA duplex (i.e., a hairpin); a modification or sequence that targets the nucleic acid to a subcellular location (e.g., nucleus, mitochondria, chloroplasts, and the like); a modification or sequence that provides for tracking (e.g., direct conjugation to a fluorescent molecule, conjugation to a moiety that facilitates fluorescent detection, a sequence that allows for fluorescent detection, etc.); a modification or sequence that provides a binding site for proteins (e.g., proteins that act on DNA and/or RNA, including transcriptional activators, transcriptional repressors, DNA methyltransferases, DNA demethylases, histone acetyltransferases, histone deacetylases, and the like); and combinations thereof.

A subject guide nucleic acid (e.g., a Cas9 guide RNA) and a subject Cas9polypeptide form a complex (i.e., bind via non-covalent interactions).The guide nucleic acid provides target specificity to the complex bycomprising a nucleotide sequence (a “targeting sequence”/“guidesequence”) that is complementary to a sequence of a target nucleic acid.The Cas9 polypeptide of the complex provides the site-specific activity(e.g., nuclease activity). In other words, the Cas9 polypeptide isguided to a target nucleic acid sequence (e.g. a target sequence in achromosomal nucleic acid; a target sequence in an extrachromosomalnucleic acid, e.g. an episomal nucleic acid, a minicircle, an ssRNA, anssDNA, etc.; a target sequence in a mitochondrial nucleic acid; a targetsequence in a chloroplast nucleic acid; a target sequence in a plasmid;etc.) by virtue of its association with the protein-binding segment ofthe guide nucleic acid.

In some embodiments, a subject guide nucleic acid comprises two separatenucleic acid molecules: an “activator” and a “targeter” (see below) andis referred to herein as a “dual guide nucleic acid”, a “double-moleculeguide nucleic acid”, or a “two-molecule guide nucleic acid.” If bothmolecules of a dual guide nucleic acid are RNA molecules, the dual guidenucleic acid can be referred to as a “dual guide RNA” or a “dgRNA.” Insome embodiments, the subject guide nucleic acid (Cas9 guide RNA) is asingle nucleic acid molecule (single polynucleotide) and is referred toherein as a “single guide nucleic acid”, a “single-molecule guidenucleic acid,” or a “one-molecule guide nucleic acid.” If a single guidenucleic acid is an RNA molecule, it can be referred to as a “singleguide RNA” or an “sgRNA.” The term “guide nucleic acid” is inclusive,referring to both dual guide nucleic acids and to single guide nucleicacids (e.g., dgRNAs, sgRNAs, etc.).

A dual guide nucleic acid comprises a crRNA-like (“CRISPR RNA” or “targeter” or “targeter RNA” or “crRNA” or “crRNA repeat”) molecule and a corresponding tracrRNA-like (“trans-acting CRISPR RNA” or “activator” or “activator RNA” or “tracrRNA”) molecule. In a single guide RNA, the targeter RNA and activator RNA are linked together, e.g., with intervening nucleotides.

A crRNA-like molecule (targeter RNA) comprises both the targeting segment (single stranded) of the guide nucleic acid and a stretch (“duplex-forming segment”) of nucleotides that forms one half of the dsRNA duplex of the protein-binding segment of the guide nucleic acid (Cas9 guide RNA). A corresponding tracrRNA-like molecule (activator RNA) comprises a stretch of nucleotides (duplex-forming segment) that forms the other half of the dsRNA duplex of the protein-binding segment of the guide nucleic acid (guide RNA). In other words, a stretch of nucleotides of a crRNA-like molecule is complementary to and hybridizes with a stretch of nucleotides of a tracrRNA-like molecule to form the dsRNA duplex of the protein-binding segment of the guide nucleic acid (Cas9 guide RNA). As such, each crRNA-like molecule can be said to have a corresponding tracrRNA-like molecule. The crRNA-like molecule additionally provides the single stranded targeting segment. Thus, a crRNA-like and a tracrRNA-like molecule (as a corresponding pair) hybridize to form a dual guide nucleic acid (or a single guide nucleic acid when the activator RNA and targeter RNA are linked together, e.g., by intervening nucleotides).
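The relationship between a targeter, an activator, and a single guide nucleic acid can be sketched as a simple data model. The Python code below is illustrative only: the class and function names, the toy sequences, the “GAAA” linker, and the targeter-then-activator ordering are assumptions made for the example and are not sequences or an arrangement taken from this disclosure. It also requires full complementarity between the duplex-forming segments, which is a simplification.

```python
from dataclasses import dataclass

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(seq.upper()))


@dataclass(frozen=True)
class Targeter:
    """crRNA-like molecule: targeting segment plus one half of the dsRNA duplex."""
    targeting_segment: str
    duplex_forming_segment: str


@dataclass(frozen=True)
class Activator:
    """tracrRNA-like molecule: the other half of the dsRNA duplex."""
    duplex_forming_segment: str


def forms_duplex(targeter: Targeter, activator: Activator) -> bool:
    """True if the two duplex-forming segments are fully complementary (simplified)."""
    return (reverse_complement_rna(targeter.duplex_forming_segment)
            == activator.duplex_forming_segment.upper())


def single_guide(targeter: Targeter, activator: Activator, linker: str = "GAAA") -> str:
    """Join a corresponding pair with intervening nucleotides (hypothetical linker)."""
    if not forms_duplex(targeter, activator):
        raise ValueError("activator and targeter are not a corresponding pair")
    return (targeter.targeting_segment + targeter.duplex_forming_segment
            + linker + activator.duplex_forming_segment)


# Toy sequences chosen only so that the duplex-forming segments pair.
targeter = Targeter(targeting_segment="GAUAUGCUC", duplex_forming_segment="GUUUUAGA")
activator = Activator(duplex_forming_segment="UCUAAAAC")
print(forms_duplex(targeter, activator))  # True
print(single_guide(targeter, activator))  # GAUAUGCUCGUUUUAGAGAAAUCUAAAAC
```

Keeping the targeter and activator as separate objects models a dual guide nucleic acid, while `single_guide` models the case where the two are covalently linked.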

The exact sequence of a given crRNA or tracrRNA molecule ischaracteristic of the species in which the RNA molecules are found. Asubject Cas9 dual guide RNA or Cas9 single guide RNA can include anycorresponding activator and targeter pair. A Cas9 guide RNA (e.g. a dualguide RNA or a single guide RNA) can be comprised of any correspondingactivator and targeter pair. Non-limiting examples of nucleotidesequences that can be included in a Cas9 guide RNA (dgRNA or sgRNA)include sequences set forth in SEQ ID NOs: 827-1075, or complementsthereof. For example, in some cases, sequences from SEQ ID NOs: 827-957(which are from tracrRNAs) or complements thereof, can pair withsequences from SEQ ID NOs: 962-1075 (which are from crRNAs), orcomplements thereof, to form a dsRNA duplex of a protein bindingsegment.

The term “activator” or “activator RNA” is used herein to mean atracrRNA-like molecule of a dual guide nucleic acid (and of a singleguide nucleic acid when the “activator RNA” and the “targeter RNA” arelinked together, e.g., by intervening nucleic acids). The term“targeter” is used herein to mean a crRNA-like molecule of a dual guidenucleic acid (and of a single guide nucleic acid when the “activator”and the “targeter” are linked together by intervening nucleic acids).The term “duplex-forming segment” is used herein to mean the stretch ofnucleotides of an activator or a targeter that contributes to theformation of the dsRNA duplex by hybridizing to a stretch of nucleotidesof a corresponding activator or targeter molecule. In other words, anactivator (activator RNA) comprises a duplex-forming segment that iscomplementary to the duplex-forming segment of the correspondingtargeter (targeter RNA). As such, an activator comprises aduplex-forming segment while a targeter comprises both a duplex-formingsegment and the targeting segment of the guide nucleic acid. A subjectsingle guide nucleic acid can comprise an “activator” and a “targeter”where the “activator” and the “targeter” are covalently linked (e.g., byintervening nucleotides). Therefore, a subject dual guide nucleic acidcan be comprised of any corresponding activator and targeter pair.

A “host cell” or “target cell” as used herein, denotes an in vivo or invitro eukaryotic cell, a prokaryotic cell (e.g., bacterial or archaealcell), or a cell from a multicellular organism (e.g., a cell line)cultured as a unicellular entity, which eukaryotic or prokaryotic cellscan be, or have been, used as recipients for a nucleic acid, and includethe progeny of the original cell which has been transformed by thenucleic acid. It is understood that the progeny of a single cell may notnecessarily be completely identical in morphology or in genomic or totalDNA complement as the original parent, due to natural, accidental, ordeliberate mutation. A “recombinant host cell” (also referred to as a“genetically modified host cell”) is a host cell into which has beenintroduced a heterologous nucleic acid, e.g., an expression vector. Forexample, a subject bacterial host cell is a genetically modifiedbacterial host cell by virtue of introduction into a suitable bacterialhost cell of an exogenous nucleic acid (e.g., a plasmid or recombinantexpression vector) and a subject eukaryotic host cell is a geneticallymodified eukaryotic host cell (e.g., a mammalian germ cell), by virtueof introduction into a suitable eukaryotic host cell of an exogenousnucleic acid.

The term “stem cell” is used herein to refer to a cell (e.g., plant stemcell, vertebrate stem cell) that has the ability both to self-renew andto generate a differentiated cell type (see Morrison et al. (1997) Cell88:287-298). In the context of cell ontogeny, the adjective“differentiated”, or “differentiating” is a relative term. A“differentiated cell” is a cell that has progressed further down thedevelopmental pathway than the cell it is being compared with. Thus,pluripotent stem cells (described below) can differentiate intolineage-restricted progenitor cells (e.g., mesodermal stem cells), whichin turn can differentiate into cells that are further restricted (e.g.,neuron progenitors), which can differentiate into end-stage cells (i.e.,terminally differentiated cells, e.g., neurons, cardiomyocytes, etc.),which play a characteristic role in a certain tissue type, and may ormay not retain the capacity to proliferate further. Stem cells may becharacterized by both the presence of specific markers (e.g., proteins,RNAs, etc.) and the absence of specific markers. Stem cells may also beidentified by functional assays both in vitro and in vivo, particularlyassays relating to the ability of stem cells to give rise to multipledifferentiated progeny.

Stem cells of interest include pluripotent stem cells (PSCs). The term“pluripotent stem cell” or “PSC” is used herein to mean a stem cellcapable of producing all cell types of the organism. Therefore, a PSCcan give rise to cells of all germ layers of the organism (e.g., theendoderm, mesoderm, and ectoderm of a vertebrate). Pluripotent cells arecapable of forming teratomas and of contributing to ectoderm, mesoderm,or endoderm tissues in a living organism. Pluripotent stem cells ofplants are capable of giving rise to all cell types of the plant (e.g.,cells of the root, stem, leaves, etc.).

PSCs of animals can be derived in a number of different ways. For example, embryonic stem cells (ESCs) are derived from the inner cell mass of an embryo (Thomson et al., Science. 1998 Nov. 6; 282(5391):1145-7) whereas induced pluripotent stem cells (iPSCs) are derived from somatic cells (Takahashi et al., Cell. 2007 Nov. 30; 131(5):861-72; Takahashi et al., Nat Protoc. 2007; 2(12):3081-9; Yu et al., Science. 2007 Dec. 21; 318(5858):1917-20. Epub 2007 Nov. 20). Because the term PSC refers to pluripotent stem cells regardless of their derivation, the term PSC encompasses the terms ESC and iPSC, as well as the term embryonic germ stem cells (EGSC), which are another example of a PSC. PSCs may be in the form of an established cell line, they may be obtained directly from primary embryonic tissue, or they may be derived from a somatic cell. PSCs can be target cells of the methods described herein.

By “embryonic stem cell” (ESC) is meant a PSC that was isolated from anembryo, typically from the inner cell mass of the blastocyst. ESC linesare listed in the NIH Human Embryonic Stem Cell Registry, e.g.hESBGN-01, hESBGN-02, hESBGN-03, hESBGN-04 (BresaGen, Inc.); HES-1,HES-2, HES-3, HES-4, HES-5, HES-6 (ES Cell International); Miz-hES1(MizMedi Hospital-Seoul National University); HSF-1, HSF-6 (Universityof California at San Francisco); and H1, H7, H9, H13, H14 (WisconsinAlumni Research Foundation (WiCell Research Institute)). Stem cells ofinterest also include embryonic stem cells from other primates, such asRhesus stem cells and marmoset stem cells. The stem cells may beobtained from any mammalian species, e.g. human, equine, bovine,porcine, canine, feline, rodent, e.g. mice, rats, hamster, primate, etc.(Thomson et al. (1998) Science 282:1145; Thomson et al. (1995) Proc.Natl. Acad. Sci USA 92:7844; Thomson et al. (1996) Biol. Reprod. 55:254;Shamblott et al., Proc. Natl. Acad. Sci. USA 95:13726, 1998). Inculture, ESCs typically grow as flat colonies with largenucleo-cytoplasmic ratios, defined borders and prominent nucleoli. Inaddition, ESCs express SSEA-3, SSEA-4, TRA-1-60, TRA-1-81, and AlkalinePhosphatase, but not SSEA-1. Examples of methods of generating andcharacterizing ESCs may be found in, for example, U.S. Pat. Nos.7,029,913, 5,843,780, and 6,200,806, the disclosures of which areincorporated herein by reference. Methods for proliferating hESCs in theundifferentiated form are described in WO 99/20741, WO 01/51616, and WO03/020920.

By “embryonic germ stem cell” (EGSC) or “embryonic germ cell” or “EGcell” is meant a PSC that is derived from germ cells and/or germ cellprogenitors, e.g. primordial germ cells, i.e. those that would becomesperm and eggs. Embryonic germ cells (EG cells) are thought to haveproperties similar to embryonic stem cells as described above. Examplesof methods of generating and characterizing EG cells may be found in,for example, U.S. Pat. No. 7,153,684; Matsui, Y., et al., (1992) Cell70:841; Shamblott, M., et al. (2001) Proc. Natl. Acad. Sci. USA 98: 113;Shamblott, M., et al. (1998) Proc. Natl. Acad. Sci. USA, 95:13726; andKoshimizu, U., et al. (1996) Development, 122:1235, the disclosures ofwhich are incorporated herein by reference.

By “induced pluripotent stem cell” or “iPSC” it is meant a PSC that is derived from a cell that is not a PSC (i.e., from a cell that is differentiated relative to a PSC). iPSCs can be derived from multiple different cell types, including terminally differentiated cells. iPSCs have an ES cell-like morphology, growing as flat colonies with large nucleo-cytoplasmic ratios, defined borders and prominent nuclei. In addition, iPSCs express one or more key pluripotency markers known by one of ordinary skill in the art, including but not limited to Alkaline Phosphatase, SSEA3, SSEA4, Sox2, Oct3/4, Nanog, TRA160, TRA181, TDGF 1, Dnmt3b, FoxD3, GDF3, Cyp26a1, TERT, and zfp42. Examples of methods of generating and characterizing iPSCs may be found in, for example, U.S. Patent Publication Nos. US20090047263, US20090068742, US20090191159, US20090227032, US20090246875, and US20090304646, the disclosures of which are incorporated herein by reference. Generally, to generate iPSCs, somatic cells are provided with reprogramming factors (e.g., Oct4, SOX2, KLF4, MYC, Nanog, Lin28, etc.) known in the art to reprogram the somatic cells to become pluripotent stem cells.

By “somatic cell” it is meant any cell in an organism that, in theabsence of experimental manipulation, does not ordinarily give rise toall types of cells in an organism. In other words, somatic cells arecells that have differentiated sufficiently that they will not naturallygenerate cells of all three germ layers of the body, i.e. ectoderm,mesoderm and endoderm. For example, somatic cells would include bothneurons and neural progenitors, the latter of which may be able tonaturally give rise to all or some cell types of the central nervoussystem but cannot give rise to cells of the mesoderm or endodermlineages.

By “mitotic cell” it is meant a cell undergoing mitosis. Mitosis is the process by which a eukaryotic cell separates the chromosomes in its nucleus into two identical sets in two separate nuclei. It is generally followed immediately by cytokinesis, which divides the nuclei, cytoplasm, organelles and cell membrane into two cells containing roughly equal shares of these cellular components.

By “post-mitotic cell” it is meant a cell that has exited from mitosis, i.e., it is “quiescent”, i.e., it is no longer undergoing divisions. This quiescent state may be temporary, i.e., reversible, or it may be permanent.

By “meiotic cell” it is meant a cell that is undergoing meiosis. Meiosis is the process by which a cell divides its nuclear material for the purpose of producing gametes or spores. Unlike mitosis, in meiosis, the chromosomes undergo a recombination step which shuffles genetic material between chromosomes. Additionally, the outcome of meiosis is four (genetically unique) haploid cells, as compared with the two (genetically identical) diploid cells produced from mitosis.

The terms “treatment”, “treating” and the like are used herein togenerally mean obtaining a desired pharmacologic and/or physiologiceffect. The effect may be prophylactic in terms of completely orpartially preventing a disease or symptom thereof and/or may betherapeutic in terms of a partial or complete cure for a disease and/oradverse effect attributable to the disease. “Treatment” as used hereincovers any treatment of a disease or symptom in a mammal, and includes:(a) preventing the disease or symptom from occurring in a subject whichmay be predisposed to acquiring the disease or symptom but has not yetbeen diagnosed as having it; (b) inhibiting the disease or symptom,i.e., arresting its development; or (c) relieving the disease, i.e.,causing regression of the disease. The therapeutic agent may beadministered before, during or after the onset of disease or injury. Thetreatment of ongoing disease, where the treatment stabilizes or reducesthe undesirable clinical symptoms of the patient, is of particularinterest. Such treatment is desirably performed prior to complete lossof function in the affected tissues. The subject therapy will desirablybe administered during the symptomatic stage of the disease, and in somecases after the symptomatic stage of the disease.

The terms “individual,” “subject,” “host,” and “patient,” are used interchangeably herein and refer to any mammalian subject for whom diagnosis, treatment, or therapy is desired, particularly humans.

In some instances, a component (e.g., a nucleic acid component (e.g., aCas9 guide RNA); a protein component (e.g., a Cas9 polypeptide, avariant Cas9 polypeptide); and the like) includes a label moiety. Theterms “label”, “detectable label”, or “label moiety” as used hereinrefer to any moiety that provides for signal detection and may varywidely depending on the particular nature of the assay. Label moietiesof interest include both directly detectable labels (direct labels)(e.g., a fluorescent label) and indirectly detectable labels (indirectlabels) (e.g., a binding pair member). A fluorescent label can be anyfluorescent label (e.g., a fluorescent dye (e.g., fluorescein, Texasred, rhodamine, ALEXAFLUOR® labels, and the like), a fluorescent protein(e.g., green fluorescent protein (GFP), enhanced GFP (EGFP), yellowfluorescent protein (YFP), red fluorescent protein (RFP), cyanfluorescent protein (CFP), cherry, tomato, tangerine, and anyfluorescent derivative thereof), etc.). Suitable detectable (directly orindirectly) label moieties for use in the methods include any moietythat is detectable by spectroscopic, photochemical, biochemical,immunochemical, electrical, optical, chemical, or other means. Forexample, suitable indirect labels include biotin (a binding pairmember), which can be bound by streptavidin (which can itself bedirectly or indirectly labeled). Labels can also include: a radiolabel(a direct label) (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P); an enzyme (anindirect label) (e.g., peroxidase, alkaline phosphatase, galactosidase,luciferase, glucose oxidase, and the like); a fluorescent protein (adirect label) (e.g., green fluorescent protein, red fluorescent protein,yellow fluorescent protein, and any convenient derivatives thereof); ametal label (a direct label); a colorimetric label; a binding pairmember; and the like. 
By “partner of a binding pair” or “binding pair member” is meant one of a first and a second moiety, wherein the first and the second moiety have a specific binding affinity for each other. Suitable binding pairs include, but are not limited to: antigen/antibodies (for example, digoxigenin/anti-digoxigenin, dinitrophenyl (DNP)/anti-DNP, dansyl-X/anti-dansyl, fluorescein/anti-fluorescein, lucifer yellow/anti-lucifer yellow, and rhodamine/anti-rhodamine), biotin/avidin (or biotin/streptavidin) and calmodulin binding protein (CBP)/calmodulin. Any binding pair member can be suitable for use as an indirectly detectable label moiety.

Any given component, or combination of components, can be unlabeled, or can be detectably labeled with a label moiety. In some cases, when two or more components are labeled, they can be labeled with label moieties that are distinguishable from one another.

General methods in molecular and cellular biochemistry can be found in such standard textbooks as Molecular Cloning: A Laboratory Manual, 3rd Ed. (Sambrook et al., Cold Spring Harbor Laboratory Press 2001); Short Protocols in Molecular Biology, 4th Ed. (Ausubel et al. eds., John Wiley & Sons 1999); Protein Methods (Bollag et al., John Wiley & Sons 1996); Nonviral Vectors for Gene Therapy (Wagner et al. eds., Academic Press 1999); Viral Vectors (Kaplitt & Loewy eds., Academic Press 1995); Immunology Methods Manual (I. Lefkovits ed., Academic Press 1997); and Cell and Tissue Culture: Laboratory Procedures in Biotechnology (Doyle & Griffiths, John Wiley & Sons 1998), the disclosures of which are incorporated herein by reference.

Before the present invention is further described, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.

Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.

It must be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a Cas9-guide RNA complex” includes a plurality of such complexes and reference to “the target nucleic acid” includes reference to one or more target nucleic acids and equivalents thereof known to those skilled in the art, and so forth. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation.

It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination. All combinations of the embodiments pertaining to the invention are specifically embraced by the present invention and are disclosed herein just as if each and every combination was individually and explicitly disclosed. In addition, all sub-combinations of the various embodiments and elements thereof are also specifically embraced by the present invention and are disclosed herein just as if each and every such sub-combination was individually and explicitly disclosed herein.

The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates, which may need to be independently confirmed.

DETAILED DESCRIPTION

The present disclosure provides a system for editing genomic DNA, the system comprising an asymmetric donor DNA template; and methods of editing genomic DNA involving use of an asymmetric donor DNA template. The present disclosure provides a system for editing genomic DNA, the system comprising a Cas9 polypeptide with reduced enzymatic activity; and methods of editing genomic DNA involving use of a Cas9 polypeptide with reduced enzymatic activity.

System Comprising an Asymmetric Donor DNA

The present disclosure provides a system for editing genomic DNA (“target genomic DNA”). The system comprises: a) a Cas9 guide RNA, or one or more nucleic acids comprising nucleotide sequences encoding the Cas9 guide RNA; and b) an asymmetric double-stranded or single-stranded donor DNA (also referred to as a “donor DNA template” or a “donor DNA molecule”). In some cases, the system comprises: a) a Cas9 guide RNA, or one or more nucleic acids comprising nucleotide sequences encoding the Cas9 guide RNA; b) an asymmetric double-stranded or single-stranded donor DNA (also referred to as a “donor DNA template” or a “donor DNA molecule”); and c) a Cas9 polypeptide or a nucleic acid comprising a nucleotide sequence encoding the Cas9 polypeptide. In some cases, the donor DNA is single stranded. In some cases, the donor DNA is double stranded. For simplicity, a system of the present disclosure that comprises an asymmetric donor DNA can be referred to as an “asymmetric donor DNA system.”

In some cases, an asymmetric donor DNA system of the present disclosurecomprises: a) a Cas9 guide RNA, or one or more nucleic acids encodingthe Cas9 guide RNA, where the Cas9 guide RNA comprises a guide sequencethat is complementary to a target sequence of a target strand of genomicDNA of a eukaryotic cell; and b) an asymmetric double stranded or singlestranded donor DNA molecule comprising a 5′ homology arm and a 3′homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides inlength, is shorter than the 5′ homology arm, and comprises at least 10consecutive nucleotides of the target sequence.

Donor DNA

An asymmetric donor DNA suitable for inclusion in a system of the present disclosure is in some cases single stranded, and in other cases double stranded. An asymmetric donor DNA suitable for inclusion in a system of the present disclosure comprises a 5′ homology arm (also referred to herein as a “long homology arm” or as “homology arm 2”) and a 3′ homology arm (also referred to herein as a “short homology arm” or as “homology arm 1”). See, e.g., FIG. 14A-14B.

5′ Homology Arm

The 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from about 40 nucleotides (nt) to about 200 nt, e.g., from 40 nt to 45 nt, from 45 nt to 50 nt, from 50 nt to 55 nt, from 55 nt to 60 nt, from 60 nt to 65 nt, from 65 nt to 70 nt, from 70 nt to 75 nt, from 75 nt to 80 nt, from 80 nt to 85 nt, from 85 nt to 90 nt, from 90 nt to 95 nt, from 95 nt to 100 nt, from 100 nt to 105 nt, from 105 nt to 110 nt, from 110 nt to 115 nt, from 115 nt to 120 nt, from 120 nt to 125 nt, from 125 nt to 130 nt, from 130 nt to 135 nt, from 135 nt to 140 nt, from 140 nt to 145 nt, from 145 nt to 150 nt, from 150 nt to 155 nt, from 155 nt to 160 nt, from 160 nt to 165 nt, from 165 nt to 170 nt, from 170 nt to 175 nt, from 175 nt to 180 nt, from 180 nt to 185 nt, from 185 nt to 190 nt, from 190 nt to 195 nt, or from 195 nt to 200 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 45 nt to 50 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 45 nt, 46 nt, 47 nt, 48 nt, 49 nt, or 50 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 50 nt to 55 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 50 nt, 51 nt, 52 nt, 53 nt, 54 nt, or 55 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 55 nt to 60 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 55 nt, 56 nt, 57 nt, 58 nt, 59 nt, or 60 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 65 nt to 70 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 65 nt, 66 nt, 67 nt, 68 nt, 69 nt, or 70 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 70 nt to 75 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 70 nt, 71 nt, 72 nt, 73 nt, 74 nt, or 75 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 75 nt to 80 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 75 nt, 76 nt, 77 nt, 78 nt, 79 nt, or 80 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 80 nt to 85 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 80 nt, 81 nt, 82 nt, 83 nt, 84 nt, or 85 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 85 nt to 90 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 85 nt, 86 nt, 87 nt, 88 nt, 89 nt, or 90 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 90 nt to 95 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 90 nt, 91 nt, 92 nt, 93 nt, 94 nt, or 95 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 95 nt to 100 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 95 nt, 96 nt, 97 nt, 98 nt, 99 nt, or 100 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 100 nt to 105 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 100 nt, 101 nt, 102 nt, 103 nt, 104 nt, or 105 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 105 nt to 115 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 105 nt, 106 nt, 107 nt, 108 nt, 109 nt, 110 nt, 111 nt, 112 nt, 113 nt, 114 nt, or 115 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 115 nt to 125 nt, e.g., the 5′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 115 nt, 116 nt, 117 nt, 118 nt, 119 nt, 120 nt, 121 nt, 122 nt, 123 nt, 124 nt, or 125 nt.

In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 115 nt to 120 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 120 nt to 125 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 125 nt to 135 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 135 nt to 140 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 140 nt to 145 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 145 nt to 150 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 150 nt to 155 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 155 nt to 160 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 160 nt to 165 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 165 nt to 170 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 175 nt to 180 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 180 nt to 185 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 185 nt to 190 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 190 nt to 195 nt. In some cases, the 5′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 195 nt to 200 nt.

3′ Homology Arm

As noted above, in some cases, an asymmetric donor DNA system of the present disclosure comprises: a) a Cas9 guide RNA, or one or more nucleic acids encoding the Cas9 guide RNA, where the Cas9 guide RNA comprises a guide sequence that is complementary to a target sequence of a target strand of genomic DNA of a eukaryotic cell; and b) an asymmetric double stranded or single stranded donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 nucleotides to 50 nucleotides in length, is shorter than the 5′ homology arm, and comprises at least 10 consecutive nucleotides of the target sequence.

The 3′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of from about 20 nucleotides (nt) to 50 nt, e.g., from 20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 35 nt, from 35 nt to 40 nt, from 40 nt to 45 nt, or from 45 nt to 50 nt.

In some cases, the 3′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 20 nt to 25 nt, e.g., the 3′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, or 25 nt.

In some cases, the 3′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 25 nt to 30 nt, e.g., the 3′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, or 30 nt.

In some cases, the 3′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 30 nt to 35 nt, e.g., the 3′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 30 nt, 31 nt, 32 nt, 33 nt, 34 nt, or 35 nt.

In some cases, the 3′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 35 nt to 40 nt, e.g., the 3′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 35 nt, 36 nt, 37 nt, 38 nt, 39 nt, or 40 nt.

In some cases, the 3′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 40 nt to 45 nt, e.g., the 3′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 40 nt, 41 nt, 42 nt, 43 nt, 44 nt, or 45 nt.

In some cases, the 3′ homology arm of an asymmetric donor DNA of the present disclosure has a length of from 45 nt to 50 nt, e.g., the 3′ homology arm of an asymmetric donor DNA of the present disclosure can have a length of 45 nt, 46 nt, 47 nt, 48 nt, 49 nt, or 50 nt.

The 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure is shorter than the 5′ homology arm. For example, in some cases, the ratio of the length of the 5′ homology arm to the 3′ homology arm of an asymmetric donor DNA of the present disclosure ranges from 1.5:1 to 5:1, e.g., from 1.5:1 to 2:1, from 2:1 to 2.5:1, from 2.5:1 to 3:1, from 3:1 to 3.5:1, from 3.5:1 to 4:1, from 4:1 to 4.5:1, or from 4.5:1 to 5:1. In some cases, the ratio of the length of the 5′ homology arm to the 3′ homology arm of an asymmetric donor DNA of the present disclosure is from 2:1 to 3:1. In some cases, the ratio of the length of the 5′ homology arm to the 3′ homology arm of an asymmetric donor DNA of the present disclosure is from 2.5:1 to 3:1.

In some cases, the ratio of the length of the 5′ homology arm to the 3′ homology arm of an asymmetric donor DNA of the present disclosure ranges from 5:1 to 10:1, e.g., from 5:1 to 6:1, from 6:1 to 7:1, from 7:1 to 8:1, from 8:1 to 9:1, or from 9:1 to 10:1.
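
As a worked illustration of the arm-length ratios described above, the following sketch computes the 5′:3′ ratio exactly and tests whether it falls within a stated range. The function names and the example lengths (a 91 nt 5′ arm and a 36 nt 3′ arm, chosen only because they lie within ranges recited in this disclosure) are hypothetical and are given for illustration only.

```python
from fractions import Fraction


def arm_ratio(len5: int, len3: int) -> Fraction:
    """Exact ratio of 5' homology arm length to 3' homology arm length."""
    if len5 <= 0 or len3 <= 0:
        raise ValueError("arm lengths must be positive")
    return Fraction(len5, len3)


def ratio_in_range(len5: int, len3: int, low: str, high: str) -> bool:
    """True if the 5':3' ratio lies between low:1 and high:1, inclusive.

    low and high are given as decimal strings (e.g., "2.5") so that the
    comparison is exact rather than subject to floating-point rounding.
    """
    return Fraction(low) <= arm_ratio(len5, len3) <= Fraction(high)


# Hypothetical example: 91 nt 5' arm and 36 nt 3' arm (ratio of about 2.53:1).
example_in_2p5_to_3 = ratio_in_range(91, 36, "2.5", "3")
```

Under these assumed lengths the ratio 91:36 lies within the 2.5:1 to 3:1 range; a pair of equal-length arms would give 1:1 and fall outside every range recited above.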

As noted above, in some cases, an asymmetric donor DNA system of the present disclosure comprises: a) a Cas9 guide RNA, or one or more nucleic acids encoding the Cas9 guide RNA, where the Cas9 guide RNA comprises a guide sequence that is complementary to a target sequence of a target strand of genomic DNA of a eukaryotic cell; and b) an asymmetric double stranded or single stranded donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides in length, is shorter than the 5′ homology arm, and comprises at least 10 consecutive nucleotides of the target sequence.

Throughout this disclosure (e.g., in the paragraphs below), as is depicted in FIG. 14B, when referring to a 3′ homology arm of a subject asymmetric donor DNA (or symmetric donor DNA), the phrase “can comprise [x] consecutive nucleotides that are identical to the same number of consecutive nucleotides of the target sequence in the target strand” as used herein can be used interchangeably with the phrase “can comprise [x] consecutive nucleotides that are 100% complementary to the same number of consecutive nucleotides of the guide sequence (targeting sequence) of the Cas9 guide RNA.” Thus, the description of the 3′ homology arm of a subject asymmetric (or symmetric) donor DNA can be phrased to have identity with the target strand of the target DNA, or can instead be phrased to have complementarity with (or be complementary to) the guide sequence (targeting sequence) of the Cas9 guide RNA. This is because (i) both the donor DNA and the target strand of the target DNA have complementarity with (are complementary to) the non-target strand of the target DNA; and (ii) both the guide sequence (targeting sequence) of the Cas9 guide RNA and the non-target strand of the target DNA have complementarity with (are complementary to) the target strand of the target DNA.

To the contrary, in some cases, the 5′ homology arm of a subject asymmetric donor DNA does not “comprise [x] consecutive nucleotides that are identical to the same number of consecutive nucleotides of the target sequence in the target strand” (e.g., does not “comprise [x] consecutive nucleotides that are 100% complementary to the same number of consecutive nucleotides of the guide sequence (targeting sequence) of the Cas9 guide RNA”).

Another way throughout this disclosure to describe the above feature of the 3′ homology arm of a subject asymmetric donor DNA (e.g., as is depicted in FIG. 14B) is to say that the 3′ homology arm of a subject asymmetric donor DNA (after the non-target strand of the target DNA is cleaved into a PAM distal non-target DNA strand and a PAM proximal non-target DNA strand) hybridizes to (i.e., has “complementarity with”, is “complementary to”) the PAM distal non-target DNA strand. Thus, this phrasing can be substituted for either of the phrases (i) “can comprise [x] consecutive nucleotides that are identical to the same number of consecutive nucleotides of the target sequence in the target strand” or (ii) “can comprise [x] consecutive nucleotides that are 100% complementary to the same number of consecutive nucleotides of the guide sequence (targeting sequence) of the Cas9 guide RNA.” In other words, a 3′ homology arm can be described as such: “can comprise [x] consecutive nucleotides that are 100% complementary to the same number of consecutive nucleotides of the PAM distal non-target DNA strand.”

To the contrary, in some cases, the 5′ homology arm of a subject asymmetric donor DNA (after the non-target strand of the target DNA is cleaved into a PAM distal non-target DNA strand and a PAM proximal non-target DNA strand) does not hybridize to (i.e., does not have “complementarity with”, is not “complementary to”) the PAM distal non-target DNA strand (it instead has complementarity with the PAM proximal non-target strand).
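
The interchangeability of the "identical to the target sequence in the target strand" phrasing and the "100% complementary to the guide sequence" phrasing can be illustrated with a short sketch. It assumes, for illustration only, that the guide sequence is the exact reverse complement (as RNA) of the target sequence; the function names and the example sequence are hypothetical and are not taken from the Sequence Listing.

```python
# Watson-Crick complement table for DNA bases.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def guide_sequence_for(target_seq: str) -> str:
    """Hypothetical guide sequence (RNA, 5' to 3') fully complementary to a
    target sequence given as it appears in the target strand (DNA, 5' to 3')."""
    return reverse_complement(target_seq).replace("T", "U")


def is_fully_complementary(dna: str, rna: str) -> bool:
    """True if the DNA segment is 100% complementary to the RNA segment."""
    return reverse_complement(dna).replace("T", "U") == rna.upper()


# Hypothetical 20 nt target sequence in the target strand.
target = "GACGTTACCGGATCAGTCAA"
guide = guide_sequence_for(target)

# A 12 nt stretch of a 3' homology arm that is identical to target[3:15] ...
arm_stretch = target[3:15]
# ... pairs antiparallel with guide positions (20 - 15) through (20 - 3).
paired_guide_stretch = guide[len(target) - 15:len(target) - 3]
stretch_is_complementary = is_fully_complementary(arm_stretch, paired_guide_stretch)
```

Because the stretch of the arm is identical to part of the target sequence, and the guide sequence is complementary to that same part, the stretch is necessarily complementary to the corresponding part of the guide sequence, which is the equivalence stated above.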

The 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can comprise from 5 consecutive nucleotides to 50 consecutive nucleotides that are identical to the same number of consecutive nucleotides of the target sequence in the target strand. For example, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can comprise from 5 consecutive nucleotides to 10 consecutive nucleotides, from 10 consecutive nucleotides to 15 consecutive nucleotides, from 15 consecutive nucleotides to 20 consecutive nucleotides, from 20 consecutive nucleotides to 25 consecutive nucleotides, from 25 consecutive nucleotides to 30 consecutive nucleotides, from 30 consecutive nucleotides to 35 consecutive nucleotides, from 35 consecutive nucleotides to 40 consecutive nucleotides, from 40 consecutive nucleotides to 45 consecutive nucleotides, from 10 consecutive nucleotides to 30 consecutive nucleotides, from 15 consecutive nucleotides to 30 consecutive nucleotides, from 18 consecutive nucleotides to 30 consecutive nucleotides, from 20 consecutive nucleotides to 30 consecutive nucleotides, from 10 consecutive nucleotides to 25 consecutive nucleotides, from 15 consecutive nucleotides to 25 consecutive nucleotides, from 18 consecutive nucleotides to 25 consecutive nucleotides, from 20 consecutive nucleotides to 25 consecutive nucleotides, or from 45 consecutive nucleotides to 50 consecutive nucleotides that are identical to the same number of consecutive nucleotides of the target sequence in the target strand.

In some cases, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure comprises at least 5 consecutive nucleotides that are identical to 5 consecutive nucleotides of the target sequence in the target strand. In some cases, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure comprises at least 10 consecutive nucleotides that are identical to 10 consecutive nucleotides of the target sequence in the target strand. In some cases, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure comprises at least 15 consecutive nucleotides that are identical to 15 consecutive nucleotides of the target sequence in the target strand. In some cases, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure comprises at least 20 consecutive nucleotides that are identical to 20 consecutive nucleotides of the target sequence in the target strand. In some cases, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure comprises at least 25 consecutive nucleotides that are identical to 25 consecutive nucleotides of the target sequence in the target strand. In some cases, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure comprises at least 30 consecutive nucleotides that are identical to 30 consecutive nucleotides of the target sequence in the target strand. In some cases, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure comprises at least 35 consecutive nucleotides that are identical to 35 consecutive nucleotides of the target sequence in the target strand.
In some cases, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure comprises at least 40 consecutive nucleotides that are identical to 40 consecutive nucleotides of the target sequence in the target strand. In some cases, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure comprises at least 45 consecutive nucleotides that are identical to 45 consecutive nucleotides of the target sequence in the target strand. In some cases, the 3′ homology arm of an asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure comprises 50 consecutive nucleotides that are identical to 50 consecutive nucleotides of the target sequence in the target strand.

An asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can include: a) a 5′ homology arm having a length of from 85 nt to 95 nt; and b) a 3′ homology arm having a length of from 20 nt to 50 nt, where the 3′ homology arm comprises at least 5 consecutive nucleotides that are identical to 5 consecutive nucleotides of the target sequence of a target strand of genomic DNA. An asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can include: a) a 5′ homology arm having a length of from 85 nt to 95 nt; and b) a 3′ homology arm having a length of from 20 nt to 50 nt, where the 3′ homology arm comprises at least 10 consecutive nucleotides that are identical to 10 consecutive nucleotides of the target sequence of a target strand of genomic DNA. An asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can include: a) a 5′ homology arm having a length of from 85 nt to 95 nt; and b) a 3′ homology arm having a length of from 20 nt to 50 nt, where the 3′ homology arm comprises at least 20 consecutive nucleotides that are identical to 20 consecutive nucleotides of the target sequence of a target strand of genomic DNA.

An asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can include: a) a 5′ homology arm having a length of from 85 nt to 95 nt; and b) a 3′ homology arm having a length of from 25 nt to 40 nt, where the 3′ homology arm comprises at least 5 consecutive nucleotides that are identical to 5 consecutive nucleotides of the target sequence of a target strand of genomic DNA. An asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can include: a) a 5′ homology arm having a length of from 85 nt to 95 nt; and b) a 3′ homology arm having a length of from 25 nt to 40 nt, where the 3′ homology arm comprises at least 10 consecutive nucleotides that are identical to 10 consecutive nucleotides of the target sequence of a target strand of genomic DNA. An asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can include: a) a 5′ homology arm having a length of from 85 nt to 95 nt; and b) a 3′ homology arm having a length of from 25 nt to 40 nt, where the 3′ homology arm comprises at least 20 consecutive nucleotides that are identical to 20 consecutive nucleotides of the target sequence of a target strand of genomic DNA.

An asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can include: a) a 5′ homology arm having a length of from 90 nt to 100 nt; and b) a 3′ homology arm having a length of from 25 nt to 40 nt, where the 3′ homology arm comprises at least 5 consecutive nucleotides that are identical to 5 consecutive nucleotides of the target sequence of a target strand of genomic DNA. An asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can include: a) a 5′ homology arm having a length of from 90 nt to 100 nt; and b) a 3′ homology arm having a length of from 25 nt to 40 nt, where the 3′ homology arm comprises at least 10 consecutive nucleotides that are identical to 10 consecutive nucleotides of the target sequence of a target strand of genomic DNA. An asymmetric donor DNA suitable for inclusion in an asymmetric donor DNA system of the present disclosure can include: a) a 5′ homology arm having a length of from 90 nt to 100 nt; and b) a 3′ homology arm having a length of from 25 nt to 40 nt, where the 3′ homology arm comprises at least 20 consecutive nucleotides that are identical to 20 consecutive nucleotides of the target sequence of a target strand of genomic DNA.

In some cases, an asymmetric donor DNA of the present disclosure does not include any nucleotides between the 5′ and 3′ homology arms; in other words, in some cases, the 3′ homology arm is contiguous with and immediately adjacent to the 5′ homology arm.

In some cases, an asymmetric donor DNA of the present disclosure includes one or more nucleotides (e.g., a stretch of nucleotides) between the 5′ and 3′ homology arms, where the stretch of nucleotides is heterologous to the target DNA being edited. For example, in some cases, an asymmetric donor DNA of the present disclosure includes a stretch of nucleotides between the 5′ and 3′ homology arms, where the stretch of nucleotides is heterologous to the target DNA being edited, and where the stretch of nucleotides has a length of from 1 nucleotide (nt) to 500 nt, e.g., from 1 nt to 5 nt, from 5 nt to 10 nt, from 10 nt to 25 nt, from 25 nt to 50 nt, from 50 nt to 100 nt, from 100 nt to 250 nt, or from 250 nt to 500 nt.

The donor sequence is typically not identical to the genomic sequence that it replaces. Rather, the donor sequence may contain one or more single base changes, insertions, deletions, inversions or rearrangements with respect to the genomic sequence, so long as sufficient homology is present to support homology-directed repair. In some embodiments, the donor sequence comprises a non-homologous sequence flanked by two regions of homology, such that homology-directed repair between the target DNA region and the two flanking sequences results in insertion of the non-homologous sequence at the target region. Donor sequences may also comprise a vector backbone containing sequences that are not homologous to the DNA region of interest and that are not intended for insertion into the DNA region of interest. Generally, the homologous region(s) of a donor sequence will have at least 50% sequence identity to a genomic sequence with which recombination is desired. In certain embodiments, 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% sequence identity is present. Any value between 1% and 100% sequence identity can be present, depending upon the length of the donor polynucleotide.
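
The sequence identity percentages recited above can be illustrated with a simple position-by-position comparison. The sketch below is a simplifying assumption made for illustration: it compares two already-aligned, equal-length sequences without gaps, whereas identity between sequences of different lengths is ordinarily determined with a sequence alignment method. The function name is hypothetical.

```python
def percent_identity(homology_region: str, genomic_region: str) -> float:
    """Ungapped percent identity between two aligned, equal-length sequences."""
    if not homology_region or len(homology_region) != len(genomic_region):
        raise ValueError("sequences must be non-empty and of equal length")
    # Count positions at which the two sequences carry the same nucleotide.
    matches = sum(
        a == b for a, b in zip(homology_region.upper(), genomic_region.upper())
    )
    return 100.0 * matches / len(homology_region)


def meets_identity_threshold(homology_region: str, genomic_region: str,
                             threshold: float = 50.0) -> bool:
    """True if identity is at least the threshold (default 50%, per the text)."""
    return percent_identity(homology_region, genomic_region) >= threshold
```

For example, a 10 nt homologous region that differs from the genomic sequence at a single position has 90% identity under this measure.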

The donor sequence may comprise certain sequence differences as compared to the target genomic sequence, e.g., restriction sites, nucleotide polymorphisms, selectable markers (e.g., drug resistance genes, fluorescent proteins, enzymes, etc.), etc., which may be used to assess for successful insertion of the donor sequence at the cleavage site or in some cases may be used for other purposes (e.g., to signify expression at the targeted genomic locus). In some cases, if located in a coding region, such nucleotide sequence differences will not change the amino acid sequence, or will make silent amino acid changes (i.e., changes which do not affect the structure or function of the protein). Alternatively, these sequence differences may include flanking recombination sequences such as FLPs, loxP sequences, or the like, that can be activated at a later time for removal of the marker sequence.

Nucleic Acid Modifications

An asymmetric donor DNA of the present disclosure (for inclusion in an asymmetric donor DNA system of the present disclosure) can include one or more modifications, e.g., a base modification, a backbone modification, etc., to provide the nucleic acid with a new or enhanced feature (e.g., improved stability, improved in vivo half life, etc.). A nucleoside is a base-sugar combination. The base portion of the nucleoside is normally a heterocyclic base. The two most common classes of such heterocyclic bases are the purines and the pyrimidines. Nucleotides are nucleosides that further include a phosphate group covalently linked to the sugar portion of the nucleoside. For those nucleosides that include a pentofuranosyl sugar, the phosphate group can be linked to the 2′, the 3′, or the 5′ hydroxyl moiety of the sugar. In forming oligonucleotides, the phosphate groups covalently link adjacent nucleosides to one another to form a linear polymeric compound. In turn, the respective ends of this linear polymeric compound can be further joined to form a circular compound; however, linear compounds are suitable. In addition, linear compounds may have internal nucleotide base complementarity and may therefore fold in a manner as to produce a fully or partially double-stranded compound. Within oligonucleotides, the phosphate groups are commonly referred to as forming the internucleoside backbone of the oligonucleotide. The normal linkage or backbone of RNA and DNA is a 3′ to 5′ phosphodiester linkage.

Suitable nucleic acid modifications include nucleoside modifications, sugar modifications, modified internucleoside linkages, and backbone modifications.

Modified Backbones and Modified Internucleoside Linkages

Examples of suitable nucleic acids containing modifications include nucleic acids containing modified backbones or non-natural internucleoside linkages. Nucleic acids having modified backbones include those that retain a phosphorus atom in the backbone and those that do not have a phosphorus atom in the backbone.

Suitable modified oligonucleotide backbones containing a phosphorus atom therein include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates including 3′-alkylene phosphonates, 5′-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates including 3′-amino phosphoramidate and aminoalkylphosphoramidates, phosphorodiamidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, selenophosphates and boranophosphates having normal 3′-5′ linkages, 2′-5′ linked analogs of these, and those having inverted polarity wherein one or more internucleotide linkages is a 3′ to 3′, 5′ to 5′ or 2′ to 2′ linkage. Suitable oligonucleotides having inverted polarity comprise a single 3′ to 3′ linkage at the 3′-most internucleotide linkage, i.e., a single inverted nucleoside residue which may be abasic (the nucleobase is missing or has a hydroxyl group in place thereof). Various salts (such as, for example, potassium or sodium), mixed salts and free acid forms are also included.

In some embodiments, a subject nucleic acid comprises one or more phosphorothioate and/or heteroatom internucleoside linkages, in particular —CH₂—NH—O—CH₂—, —CH₂—N(CH₃)—O—CH₂— (known as a methylene (methylimino) or MMI backbone), —CH₂—O—N(CH₃)—CH₂—, —CH₂—N(CH₃)—N(CH₃)—CH₂— and —O—N(CH₃)—CH₂—CH₂— (wherein the native phosphodiester internucleotide linkage is represented as —O—P(═O)(OH)—O—CH₂—). MMI type internucleoside linkages are disclosed in the above referenced U.S. Pat. No. 5,489,677. Suitable amide internucleoside linkages are disclosed in U.S. Pat. No. 5,602,240.

Also suitable are nucleic acids having morpholino backbone structures as described in, e.g., U.S. Pat. No. 5,034,506. For example, in some embodiments, a subject nucleic acid comprises a 6-membered morpholino ring in place of a ribose ring. In some of these embodiments, a phosphorodiamidate or other non-phosphodiester internucleoside linkage replaces a phosphodiester linkage.

Suitable modified polynucleotide backbones that do not include a phosphorus atom therein have backbones that are formed by short chain alkyl or cycloalkyl internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl internucleoside linkages, or one or more short chain heteroatomic or heterocyclic internucleoside linkages. These include those having morpholino linkages (formed in part from the sugar portion of a nucleoside); siloxane backbones; sulfide, sulfoxide and sulfone backbones; formacetyl and thioformacetyl backbones; methylene formacetyl and thioformacetyl backbones; riboacetyl backbones; alkene containing backbones; sulfamate backbones; methyleneimino and methylenehydrazino backbones; sulfonate and sulfonamide backbones; amide backbones; and others having mixed N, O, S and CH₂ component parts.

Mimetics

A subject nucleic acid can be a nucleic acid mimetic. The term “mimetic” as it is applied to polynucleotides is intended to include polynucleotides wherein only the furanose ring or both the furanose ring and the internucleotide linkage are replaced with non-furanose groups. Replacement of only the furanose ring is also referred to in the art as being a sugar surrogate. The heterocyclic base moiety or a modified heterocyclic base moiety is maintained for hybridization with an appropriate target nucleic acid. One such nucleic acid, a polynucleotide mimetic that has been shown to have excellent hybridization properties, is referred to as a peptide nucleic acid (PNA). In PNA, the sugar-backbone of a polynucleotide is replaced with an amide containing backbone, in particular an aminoethylglycine backbone. The nucleobases are retained and are bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone.

One polynucleotide mimetic that is suitable for use is a peptide nucleic acid (PNA). The backbone in PNA compounds is two or more linked aminoethylglycine units which gives PNA an amide containing backbone. The heterocyclic base moieties are bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone. Representative U.S. patents that describe the preparation of PNA compounds include, but are not limited to: U.S. Pat. Nos. 5,539,082; 5,714,331; and 5,719,262.

Another class of polynucleotide mimetic that has been studied is based on linked morpholino units (morpholino nucleic acid) having heterocyclic bases attached to the morpholino ring. A number of linking groups have been reported that link the morpholino monomeric units in a morpholino nucleic acid. One class of linking groups has been selected to give a non-ionic oligomeric compound. Morpholino-based polynucleotides are non-ionic mimics of oligonucleotides that are less likely to form undesired interactions with cellular proteins (Dwaine A. Braasch and David R. Corey, Biochemistry, 2002, 41(14), 4503-4510). Morpholino-based polynucleotides are disclosed in U.S. Pat. No. 5,034,506. A variety of compounds within the morpholino class of polynucleotides have been prepared, having a variety of different linking groups joining the monomeric subunits.

A further class of polynucleotide mimetic is referred to as cyclohexenyl nucleic acids (CeNA). The furanose ring normally present in a DNA/RNA molecule is replaced with a cyclohexenyl ring. CeNA DMT protected phosphoramidite monomers have been prepared and used for oligomeric compound synthesis following classical phosphoramidite chemistry. Fully modified CeNA oligomeric compounds and oligonucleotides having specific positions modified with CeNA have been prepared and studied (see Wang et al., J. Am. Chem. Soc., 2000, 122, 8595-8602). In general, the incorporation of CeNA monomers into a DNA chain increases the stability of a DNA/RNA hybrid. CeNA oligoadenylates formed complexes with RNA and DNA complements with similar stability to the native complexes. The study of incorporating CeNA structures into natural nucleic acid structures was shown by NMR and circular dichroism to proceed with easy conformational adaptation.

A further modification includes Locked Nucleic Acids (LNAs) in which the 2′-hydroxyl group is linked to the 4′ carbon atom of the sugar ring, thereby forming a 2′-C,4′-C-oxymethylene linkage and thus a bicyclic sugar moiety. The linkage can be a methylene (—CH₂—)_(n) group bridging the 2′ oxygen atom and the 4′ carbon atom wherein n is 1 or 2 (Singh et al., Chem. Commun., 1998, 4, 455-456). LNA and LNA analogs display very high duplex thermal stabilities with complementary DNA and RNA (Tm=+3 to +10° C.), stability towards 3′-exonucleolytic degradation and good solubility properties. Potent and nontoxic antisense oligonucleotides containing LNAs have been described (e.g., Wahlestedt et al., Proc. Natl. Acad. Sci. U.S.A., 2000, 97, 5633-5638).

The synthesis and preparation of the LNA monomers adenine, cytosine, guanine, 5-methyl-cytosine, thymine and uracil, along with their oligomerization, and nucleic acid recognition properties have been described (e.g., Koshkin et al., Tetrahedron, 1998, 54, 3607-3630). LNAs and preparation thereof are also described in WO 98/39352 and WO 99/14226, as well as U.S. applications 20120165514, 20100216983, 20090041809, 20060117410, 20040014959, 20020094555, and 20020086998.

Modified Sugar Moieties

A subject nucleic acid can also include one or more substituted sugar moieties. Suitable polynucleotides comprise a sugar substituent group selected from: OH; F; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; O-, S- or N-alkynyl; or O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl may be substituted or unsubstituted C₁ to C₁₀ alkyl or C₂ to C₁₀ alkenyl and alkynyl. Particularly suitable are O((CH₂)_(n)O)_(m)CH₃, O(CH₂)_(n)OCH₃, O(CH₂)_(n)NH₂, O(CH₂)_(n)CH₃, O(CH₂)_(n)ONH₂, and O(CH₂)_(n)ON((CH₂)_(n)CH₃)₂, where n and m are from 1 to about 10. Other suitable polynucleotides comprise a sugar substituent group selected from: C₁ to C₁₀ lower alkyl, substituted lower alkyl, alkenyl, alkynyl, alkaryl, aralkyl, O-alkaryl or O-aralkyl, SH, SCH₃, OCN, Cl, Br, CN, CF₃, OCF₃, SOCH₃, SO₂CH₃, ONO₂, NO₂, N₃, NH₂, heterocycloalkyl, heterocycloalkaryl, aminoalkylamino, polyalkylamino, substituted silyl, an RNA cleaving group, a reporter group, an intercalator, a group for improving the pharmacokinetic properties of an oligonucleotide, or a group for improving the pharmacodynamic properties of an oligonucleotide, and other substituents having similar properties. A suitable modification includes 2′-methoxyethoxy (2′-O—CH₂CH₂OCH₃, also known as 2′-O-(2-methoxyethyl) or 2′-MOE) (Martin et al., Helv. Chim. Acta, 1995, 78, 486-504) i.e., an alkoxyalkoxy group. A further suitable modification includes 2′-dimethylaminooxyethoxy, i.e., a O(CH₂)₂ON(CH₃)₂ group, also known as 2′-DMAOE, as described in examples hereinbelow, and 2′-dimethylaminoethoxyethoxy (also known in the art as 2′-O-dimethyl-amino-ethoxy-ethyl or 2′-DMAEOE), i.e., 2′-O—CH₂—O—CH₂—N(CH₃)₂.

Other suitable sugar substituent groups include methoxy (—O—CH₃), aminopropoxy (—OCH₂CH₂CH₂NH₂), allyl (—CH₂—CH═CH₂), —O-allyl (—O—CH₂—CH═CH₂) and fluoro (F). 2′-sugar substituent groups may be in the arabino (up) position or ribo (down) position. A suitable 2′-arabino modification is 2′-F. Similar modifications may also be made at other positions on the oligomeric compound, particularly the 3′ position of the sugar on the 3′ terminal nucleoside or in 2′-5′ linked oligonucleotides and the 5′ position of 5′ terminal nucleotide. Oligomeric compounds may also have sugar mimetics such as cyclobutyl moieties in place of the pentofuranosyl sugar.

Base Modifications and Substitutions

A subject nucleic acid may also include nucleobase (often referred to in the art simply as “base”) modifications or substitutions. As used herein, “unmodified” or “natural” nucleobases include the purine bases adenine (A) and guanine (G), and the pyrimidine bases thymine (T), cytosine (C) and uracil (U). Modified nucleobases include other synthetic and natural nucleobases such as 5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-halouracil and cytosine, 5-propynyl (—C≡C—CH₃) uracil and cytosine and other alkynyl derivatives of pyrimidine bases, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines, 5-halo particularly 5-bromo, 5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine and 7-methyladenine, 2-F-adenine, 2-aminoadenine, 8-azaguanine and 8-azaadenine, 7-deazaguanine and 7-deazaadenine and 3-deazaguanine and 3-deazaadenine. Further modified nucleobases include tricyclic pyrimidines such as phenoxazine cytidine (1H-pyrimido(5,4-b)(1,4)benzoxazin-2(3H)-one), phenothiazine cytidine (1H-pyrimido(5,4-b)(1,4)benzothiazin-2(3H)-one), G-clamps such as a substituted phenoxazine cytidine (e.g., 9-(2-aminoethoxy)-H-pyrimido(5,4-b)(1,4)benzoxazin-2(3H)-one), carbazole cytidine (2H-pyrimido(4,5-b)indol-2-one), pyridoindole cytidine (H-pyrido(3′,2′:4,5)pyrrolo(2,3-d)pyrimidin-2-one).

Heterocyclic base moieties may also include those in which the purine or pyrimidine base is replaced with other heterocycles, for example 7-deaza-adenine, 7-deazaguanosine, 2-aminopyridine and 2-pyridone. Further nucleobases include those disclosed in U.S. Pat. No. 3,687,808, those disclosed in The Concise Encyclopedia Of Polymer Science And Engineering, pages 858-859, Kroschwitz, J. I., ed. John Wiley & Sons, 1990, those disclosed by Englisch et al., Angewandte Chemie, International Edition, 1991, 30, 613, and those disclosed by Sanghvi, Y. S., Chapter 15, Antisense Research and Applications, pages 289-302, Crooke, S. T. and Lebleu, B., ed., CRC Press, 1993. Certain of these nucleobases are useful for increasing the binding affinity of an oligomeric compound. These include 5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including 2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine. 5-methylcytosine substitutions have been shown to increase nucleic acid duplex stability by 0.6-1.2° C. (Sanghvi et al., eds., Antisense Research and Applications, CRC Press, Boca Raton, 1993, pp. 276-278) and are suitable base substitutions, e.g., when combined with 2′-O-methoxyethyl sugar modifications.

Guide Nucleic Acid

A guide nucleic acid suitable for inclusion in a system of the present disclosure (e.g., an asymmetric donor DNA system of the present disclosure) directs the activities of an associated polypeptide (e.g., a Cas9 polypeptide) to a specific target sequence within a target nucleic acid. A suitable guide nucleic acid comprises: a first segment (also referred to herein as a “nucleic acid targeting segment”, or simply a “targeting segment”); and a second segment (also referred to herein as a “protein-binding segment”).

As noted above, in some embodiments, a subject guide nucleic acid (e.g., a Cas9 guide RNA) comprises two separate nucleic acid molecules: an “activator” and a “targeter” and is referred to herein as a “dual guide nucleic acid”, a “double-molecule guide nucleic acid”, or a “two-molecule guide nucleic acid.” If both molecules of a dual guide nucleic acid are RNA molecules, the dual guide nucleic acid can be referred to as a “dual guide RNA” or a “dgRNA.” In some embodiments, the subject guide nucleic acid (Cas9 guide RNA) is a single nucleic acid molecule (single polynucleotide) and is referred to herein as a “single guide nucleic acid”, a “single-molecule guide nucleic acid,” or a “one-molecule guide nucleic acid.” If a single guide nucleic acid is an RNA molecule, it can be referred to as a “single guide RNA” or an “sgRNA” (e.g., a Cas9 single guide RNA). The terms “guide nucleic acid” and “guide RNA” are inclusive, referring to both dual guide nucleic acids and to single guide nucleic acids (e.g., dgRNAs and sgRNAs).

A dual guide nucleic acid comprises a crRNA-like (“CRISPR RNA” or “targeter” or “targeter RNA” or “crRNA” or “crRNA repeat”) molecule and a corresponding tracrRNA-like (“trans-acting CRISPR RNA” or “activator” or “activator RNA” or “tracrRNA”) molecule. In a single guide RNA, the targeter RNA and activator RNA are linked together, e.g., with intervening nucleotides.

In some cases, a guide nucleic acid suitable for inclusion in a system of the present disclosure (e.g., an asymmetric donor DNA system of the present disclosure) includes a nucleic acid modification. Suitable nucleic acid modifications include nucleoside modifications, sugar modifications, modified internucleoside linkages, and backbone modifications. Examples of suitable nucleic acid modifications are as described above for the asymmetric donor DNA.

First Segment: Targeting Segment

The first segment of a guide nucleic acid comprises a nucleotide sequence that is complementary to a sequence (a target site) in a target nucleic acid. In other words, the targeting segment of a subject guide nucleic acid can interact with a target nucleic acid (e.g., hybridize with the target strand of the double stranded genomic DNA) in a sequence-specific manner via hybridization (i.e., base pairing) (e.g., see FIGS. 14A-14B and 15A-15B). As such, the nucleotide sequence of the targeting segment may vary and can determine the location within the target nucleic acid at which the guide nucleic acid and the target nucleic acid will interact. The targeting segment of a guide nucleic acid can be modified (e.g., by genetic engineering) to hybridize to any desired sequence (target site) within a target nucleic acid.

The targeting segment can have a length of from about 12 nucleotides to about 100 nucleotides. For example, the targeting segment can have a length of from about 12 nucleotides (nt) to about 80 nt, from about 12 nt to about 50 nt, from about 12 nt to about 40 nt, from about 12 nt to about 30 nt, from about 12 nt to about 25 nt, from about 12 nt to about 20 nt, or from about 12 nt to about 19 nt. For example, the targeting segment can have a length of from about 19 nt to about 20 nt, from about 19 nt to about 25 nt, from about 19 nt to about 30 nt, from about 19 nt to about 35 nt, from about 19 nt to about 40 nt, from about 19 nt to about 45 nt, from about 19 nt to about 50 nt, from about 19 nt to about 60 nt, from about 19 nt to about 70 nt, from about 19 nt to about 80 nt, from about 19 nt to about 90 nt, from about 19 nt to about 100 nt, from about 20 nt to about 25 nt, from about 20 nt to about 30 nt, from about 20 nt to about 35 nt, from about 20 nt to about 40 nt, from about 20 nt to about 45 nt, from about 20 nt to about 50 nt, from about 20 nt to about 60 nt, from about 20 nt to about 70 nt, from about 20 nt to about 80 nt, from about 20 nt to about 90 nt, or from about 20 nt to about 100 nt.

The nucleotide sequence (the targeting sequence) of the targeting segment that is complementary to a nucleotide sequence (target site) of the target nucleic acid can have a length of 12 nt or more. For example, the targeting sequence of the targeting segment that is complementary to a target site of the target nucleic acid can have a length of 12 nt or more, 15 nt or more, 18 nt or more, 19 nt or more, 20 nt or more, 25 nt or more, 30 nt or more, 35 nt or more, or 40 nt or more. For example, the targeting sequence of the targeting segment that is complementary to a target sequence of the target nucleic acid can have a length of from about 12 nucleotides (nt) to about 80 nt, from about 12 nt to about 50 nt, from about 12 nt to about 45 nt, from about 12 nt to about 40 nt, from about 12 nt to about 35 nt, from about 12 nt to about 30 nt, from about 12 nt to about 25 nt, from about 12 nt to about 20 nt, from about 12 nt to about 19 nt, from about 19 nt to about 20 nt, from about 19 nt to about 25 nt, from about 19 nt to about 30 nt, from about 19 nt to about 35 nt, from about 19 nt to about 40 nt, from about 19 nt to about 45 nt, from about 19 nt to about 50 nt, from about 19 nt to about 60 nt, from about 20 nt to about 25 nt, from about 20 nt to about 30 nt, from about 20 nt to about 35 nt, from about 20 nt to about 40 nt, from about 20 nt to about 45 nt, from about 20 nt to about 50 nt, or from about 20 nt to about 60 nt.

In some cases, the targeting sequence of the targeting segment that is complementary to a target site of the target nucleic acid is 20 nucleotides in length. In some cases, the targeting sequence of the targeting segment that is complementary to a target site of the target nucleic acid is 19 nucleotides in length.

The percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid can be 60% or more (e.g., 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 97% or more, 98% or more, 99% or more, or 100%). In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the seven contiguous 5′-most nucleotides of the target site of the target nucleic acid. In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 60% or more over about 20 contiguous nucleotides. In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the fourteen contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% over the remainder. In such a case, the targeting sequence can be considered to be 14 nucleotides in length. In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the seven contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% over the remainder. In such a case, the targeting sequence can be considered to be 7 nucleotides in length.
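As a non-limiting illustration, percent complementarity over the whole target site, or over only its 5′-most nucleotides, can be computed as sketched below. The sketch assumes an RNA targeting sequence and a DNA target site of equal length, both written 5′ to 3′, paired antiparallel without gaps, and it counts only Watson-Crick pairs (G-U wobble pairs are treated as mismatches). All function names and the 20-nt example sequence are hypothetical.

```python
# Watson-Crick pairs written as (target DNA base, guide RNA base).
# Assumption: G-U wobble pairs are NOT counted as complementary here.
WATSON_CRICK = {("A", "U"), ("T", "A"), ("G", "C"), ("C", "G")}
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def fully_complementary_guide(target_site: str) -> str:
    """RNA targeting sequence (5'->3') that pairs perfectly with target_site (5'->3')."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(target_site.upper()))


def percent_complementarity(targeting_seq, target_site, five_prime_most=None):
    """Percent of target-site positions that are base paired.

    If five_prime_most is given, only that many 5'-most nucleotides of the
    target site are scored; otherwise the whole site is scored.
    """
    guide = targeting_seq.upper()
    target = target_site.upper()
    if len(guide) != len(target):
        raise ValueError("this ungapped sketch requires equal-length sequences")
    n = len(target) if five_prime_most is None else five_prime_most
    if not 0 < n <= len(target):
        raise ValueError("five_prime_most must be between 1 and the target length")
    # Reverse the guide so index i of each string is a paired position.
    antiparallel_guide = guide[::-1]
    paired = sum((t, g) in WATSON_CRICK
                 for t, g in zip(target[:n], antiparallel_guide[:n]))
    return 100.0 * paired / n


target_site = "GATTACAGATTACAGATTAC"  # hypothetical 20-nt target site, 5'->3'
perfect = fully_complementary_guide(target_site)

# Mismatch the six 5'-most guide bases, which pair with the six 3'-most
# target bases, leaving the fourteen 5'-most target bases fully paired.
swap = {"A": "C", "C": "A", "G": "U", "U": "G"}
partial = "".join(swap[b] for b in perfect[:6]) + perfect[6:]

overall = percent_complementarity(partial, target_site)
first_14 = percent_complementarity(partial, target_site, five_prime_most=14)
```

In this example the targeting sequence is fully complementary over the fourteen 5′-most nucleotides of the target site and not complementary over the remainder, which corresponds to the fourteen-nucleotide case described above.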

Second Segment: Protein-Binding Segment

The protein-binding segment of a guide nucleic acid (e.g., Cas9 guide RNA) interacts with a Cas9 polypeptide. A guide nucleic acid guides the bound polypeptide to a specific nucleotide sequence within target nucleic acid via the above mentioned targeting segment. The protein-binding segment of a suitable guide nucleic acid (e.g., Cas9 guide RNA) comprises two stretches of nucleotides that are complementary to one another. The complementary nucleotides of the protein-binding segment hybridize to form a double stranded RNA duplex (dsRNA) (e.g., see FIGS. 16A-16C).

A subject dual guide nucleic acid (e.g., Cas9 dual guide RNA) comprises two separate nucleic acid molecules. Each of the two molecules of a subject dual guide nucleic acid comprises a stretch of nucleotides that are complementary to one another such that the complementary nucleotides of the two molecules hybridize to form the double stranded RNA duplex of the protein-binding segment (e.g., see FIG. 16A). A subject single guide nucleic acid (e.g., Cas9 single guide RNA) includes two stretches of nucleotides (corresponding to the activator RNA and the targeter RNA of a dual guide RNA) that each include a region (a duplex-forming segment) that is complementary to one another such that the complementary nucleotides of the two duplex-forming segments hybridize to form the double stranded RNA duplex of the protein-binding segment (e.g., see FIG. 16B and FIG. 16C, e.g., this region is sometimes referred to as the “stem loop”).

In some cases, the protein-binding segment includes stem loop 1 (the “nexus”) of a Cas9 guide RNA (e.g., see FIG. 16C). For example, in some cases, the activator RNA of a Cas9 guide RNA (dgRNA or sgRNA) includes (i) a duplex forming segment that contributes to the dsRNA duplex of the protein-binding segment; and (ii) nucleotides 3′ of the duplex forming segment, e.g., that form stem loop 1 (the “nexus”). In some cases, a Cas9 guide RNA includes 5 or more nucleotides (nt) (e.g., 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 15 or more, 20 or more, 30 or more, 40 or more, 50 or more, 60 or more, 70 or more, 75 or more, or 80 or more nt) (e.g., of wild type tracr sequence) 3′ of the dsRNA duplex (where 3′ is relative to the duplex-forming segment of the activator sequence).

The dsRNA duplex of the guide RNA (sgRNA or dgRNA) that forms between the activator and targeter is sometimes referred to herein as the “stem loop”. In addition, the activator (activator RNA, tracrRNA) of many naturally existing Cas9 guide RNAs (e.g., S. pyogenes guide RNAs) has 3 stem loops (3 hairpins) that are 3′ of the duplex-forming segment of the activator. The closest stem loop to the duplex-forming segment of the activator (3′ of the duplex forming segment) is called “stem loop 1” (and is also referred to herein as the “nexus”); the next stem loop is called “stem loop 2” (and is also referred to herein as the “hairpin 1”); and the next stem loop is called “stem loop 3” (and is also referred to herein as the “hairpin 2”). For example, see FIG. 16C for clarification of the nomenclature.

In some cases, an activator RNA (of a Cas9 guide RNA) has stem loop 1, but does not have stem loop 2 and does not have stem loop 3. In some cases, an activator (of a Cas9 guide RNA) has stem loop 1 and stem loop 2, but does not have stem loop 3. In some cases, an activator (of a Cas9 guide RNA) has stem loops 1, 2, and 3.

In some cases, the activator RNA (e.g., tracr sequence) of a Cas9 guide RNA (dgRNA or sgRNA) includes (i) a duplex forming segment that contributes to the dsRNA duplex of the protein-binding segment; and (ii) nucleotides 3′ of the duplex forming segment (and therefore the Cas9 guide RNA includes (ii)). In some cases, the additional nucleotides 3′ of the duplex forming segment form stem loop 1. In some cases, the activator (e.g., tracr sequence) of a Cas9 guide RNA (dgRNA or sgRNA) includes (i) a duplex forming segment that contributes to the dsRNA duplex of the protein-binding segment; and (ii) 5 or more nucleotides (e.g., 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 20 or more, 25 or more, 30 or more, 35 or more, 40 or more, 45 or more, 50 or more, 60 or more, 70 or more, or 75 or more nucleotides) (e.g., in some cases having wild type tracr sequence) 3′ of the duplex forming segment (and therefore the Cas9 guide RNA includes (ii)).

In some cases, the activator (e.g., tracr sequence) of a Cas9 guide RNA (dgRNA or sgRNA) includes (i) a duplex forming segment that contributes to the dsRNA duplex of the protein-binding segment; and (ii) a stretch of nucleotides (e.g., referred to herein as a 3′ tail) (e.g., in some cases having wild type tracr sequence) 3′ of the duplex forming segment (and therefore the Cas9 guide RNA includes (ii)).

In some cases, the stretch of nucleotides 3′ of the duplex forming segment has a length in a range of from 5 to 200 nucleotides (nt) (e.g., from 5 to 150 nt, from 5 to 130 nt, from 5 to 120 nt, from 5 to 100 nt, from 5 to 80 nt, from 10 to 200 nt, from 10 to 150 nt, from 10 to 130 nt, from 10 to 120 nt, from 10 to 100 nt, from 10 to 80 nt, from 12 to 200 nt, from 12 to 150 nt, from 12 to 130 nt, from 12 to 120 nt, from 12 to 100 nt, from 12 to 80 nt, from 15 to 200 nt, from 15 to 150 nt, from 15 to 130 nt, from 15 to 120 nt, from 15 to 100 nt, from 15 to 80 nt, from 20 to 200 nt, from 20 to 150 nt, from 20 to 130 nt, from 20 to 120 nt, from 20 to 100 nt, from 20 to 80 nt, from 30 to 200 nt, from 30 to 150 nt, from 30 to 130 nt, from 30 to 120 nt, from 30 to 100 nt, or from 30 to 80 nt). In some cases the stretch is wild type tracr sequence.

Non-limiting examples of nucleotide sequences that can be included in a dual or single guide nucleic acid (e.g., dual guide RNA or single guide RNA) include any of the sequences set forth in SEQ ID NOs: 827-957, or complements thereof, paired with any of the sequences set forth in SEQ ID NOs: 962-1075, or complements thereof, that can hybridize to form a protein-binding segment.

A subject single guide nucleic acid comprises two stretches of nucleotides (much like a “targeter” and an “activator” of a dual guide nucleic acid) that are complementary to one another, hybridize to form the double stranded RNA duplex (dsRNA duplex) of the protein-binding segment (thus resulting in a stem-loop structure), and are covalently linked (e.g., by intervening nucleotides, referred to as “linkers” or “linker nucleotides”). Thus, a subject single guide nucleic acid (e.g., a single guide RNA) can comprise a targeter and an activator, each having a duplex-forming segment, where the duplex-forming segments of the targeter and the activator hybridize with one another to form a dsRNA duplex. The targeter and the activator can be covalently linked via the 3′ end of the targeter and the 5′ end of the activator (e.g., see FIG. 16B and FIG. 16C). Alternatively, the targeter and the activator can be covalently linked via the 5′ end of the targeter and the 3′ end of the activator.

A linker of a single guide nucleic acid can have a length of from about 3 nucleotides to about 100 nucleotides. For example, the linker can have a length of from about 3 nucleotides (nt) to about 90 nt, from about 3 nt to about 80 nt, from about 3 nt to about 70 nt, from about 3 nt to about 60 nt, from about 3 nt to about 50 nt, from about 3 nt to about 40 nt, from about 3 nt to about 30 nt, from about 3 nt to about 20 nt or from about 3 nt to about 10 nt. For example, the linker can have a length of from about 3 nt to about 5 nt, from about 5 nt to about 10 nt, from about 10 nt to about 15 nt, from about 15 nt to about 20 nt, from about 20 nt to about 25 nt, from about 25 nt to about 30 nt, from about 30 nt to about 35 nt, from about 35 nt to about 40 nt, from about 40 nt to about 50 nt, from about 50 nt to about 60 nt, from about 60 nt to about 70 nt, from about 70 nt to about 80 nt, from about 80 nt to about 90 nt, or from about 90 nt to about 100 nt. In some embodiments, the linker of a single guide nucleic acid is 4 nt.
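The covalent linkage of a targeter and an activator through linker nucleotides can be sketched, for illustration only, as a simple string assembly. The function name, its parameters, the default 4-nt linker "GAAA", and the placeholder sequences below are all assumptions made for this example; they are not sequences from the Sequence Listing. The 3 to 100 nt bounds follow the linker lengths recited above, treated here as hard limits for simplicity.

```python
def assemble_single_guide(targeter: str, activator: str, linker: str = "GAAA",
                          targeter_first: bool = True,
                          min_linker: int = 3, max_linker: int = 100) -> str:
    """Join a targeter and an activator (both 5'->3') through a linker.

    targeter_first=True links the 3' end of the targeter to the 5' end of
    the activator; False gives the alternative orientation, linking the
    3' end of the activator to the 5' end of the targeter.
    """
    if not min_linker <= len(linker) <= max_linker:
        raise ValueError(
            f"linker length {len(linker)} nt is outside {min_linker}-{max_linker} nt")
    if targeter_first:
        return targeter + linker + activator
    return activator + linker + targeter


# Placeholder sequences chosen only so the three parts are easy to see.
targeter_rna = "AAAAAAAAAAAAAAAAAAAACCCC"  # targeting sequence + duplex-forming segment
activator_rna = "GGGGUUUUUUUU"             # duplex-forming segment + 3' tail

single_guide = assemble_single_guide(targeter_rna, activator_rna)
```

With these placeholders, the 3′-terminal "CCCC" of the targeter and the 5′-terminal "GGGG" of the activator would be the duplex-forming segments brought together by the 4-nt linker.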

An example single guide nucleic acid comprises two complementary stretches of nucleotides that hybridize to form a dsRNA duplex. In some embodiments, one of the two complementary stretches of nucleotides of the single guide nucleic acid (or the DNA encoding the stretch) is 60% or more identical to one of the activator (tracrRNA) molecules set forth in SEQ ID NOs: 827-957 (which are from tracrRNAs), or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides). For example, one of the two complementary stretches of nucleotides of the single guide nucleic acid (or the DNA encoding the stretch) is 65% or more identical, 70% or more identical, 75% or more identical, 80% or more identical, 85% or more identical, 90% or more identical, 95% or more identical, 98% or more identical, 99% or more identical or 100% identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

In some embodiments, one of the two complementary stretches of nucleotides of the single guide nucleic acid (or the DNA encoding the stretch) is 60% or more identical to one of the targeter (crRNA) sequences set forth in SEQ ID NOs: 962-1075 (which are from crRNAs), or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides). For example, one of the two complementary stretches of nucleotides of the single guide nucleic acid (or the DNA encoding the stretch) is 65% or more identical, 70% or more identical, 75% or more identical, 80% or more identical, 85% or more identical, 90% or more identical, 95% or more identical, 98% or more identical, 99% or more identical or 100% identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

In some embodiments, one of the two complementary stretches of nucleotides of the single guide nucleic acid (or the DNA encoding the stretch) is 60% or more identical to one of the targeter (crRNA) sequences set forth in SEQ ID NOs: 962-1075 (which are from crRNAs), or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides) and the other of the two complementary stretches of nucleotides of the single guide nucleic acid (or the DNA encoding the stretch) is 60% or more identical to one of the activator (tracrRNA) molecules set forth in SEQ ID NOs: 827-957 (which are from tracrRNAs), or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides). For example, in some cases, one of the two complementary stretches of nucleotides of the single guide nucleic acid (or the DNA encoding the stretch) is 65% or more identical, 70% or more identical, 75% or more identical, 80% or more identical, 85% or more identical, 90% or more identical, 95% or more identical, 98% or more identical, 99% or more identical or 100% identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides) and the other of the two complementary stretches of nucleotides of the single guide nucleic acid (or the DNA encoding the stretch) is 65% or more identical, 70% or more identical, 75% or more identical, 80% or more identical, 85% or more identical, 90% or more identical, 95% or more identical, 98% or more identical, 99% or more identical or 100% identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).
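The identity thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch only, assuming that "identical over a stretch of N contiguous nucleotides" is read as an ungapped, position-by-position comparison of N-nucleotide windows. The function names and the example strings are hypothetical and are not taken from the Sequence Listing. The sketch does not handle the "or a complement thereof" alternative, which would require comparing against the complement as well.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def best_window_identity(query: str, reference: str, window: int = 8) -> float:
    """Highest ungapped percent identity between any `window`-nt stretch of
    `query` and any `window`-nt stretch of `reference` (assumed reading)."""
    # Compare at the RNA level so that a DNA encoding (T) matches the RNA (U).
    q = query.upper().replace("T", "U")
    r = reference.upper().replace("T", "U")
    if window < 1 or len(q) < window or len(r) < window:
        raise ValueError("window must be >= 1 and no longer than either sequence")
    return max(
        percent_identity(q[i:i + window], r[j:j + window])
        for i in range(len(q) - window + 1)
        for j in range(len(r) - window + 1)
    )


# Hypothetical usage: does a candidate stretch meet a 60% threshold over 8 nt?
meets_threshold = best_window_identity("GTTTTAGAGCTA", "GUUUUAGAGCUA", window=8) >= 60.0
```

A dedicated alignment tool would ordinarily be used in practice; the exhaustive window scan here is chosen only because it makes the percentage arithmetic explicit.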

Appropriate cognate pairs of targeters and activators can be routinely determined for SEQ ID NOs: 827-957 and 962-1075 by taking into account the species name and base-pairing (for the dsRNA duplex of the protein-binding domain). Any activator/targeter pair can be used as part of a subject dual guide nucleic acid or as part of a subject single guide nucleic acid.

In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) includes a stretch of nucleotides with 60% or more sequence identity (e.g., 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, or 100% sequence identity) with an activator (tracrRNA) molecule set forth in any one of SEQ ID NOs: 827-957, or a complement thereof. In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) includes a stretch of nucleotides with 70% or more sequence identity with an activator (tracrRNA) molecule set forth in any one of SEQ ID NOs: 827-957, or a complement thereof. In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) includes a stretch of nucleotides with 75% or more sequence identity with an activator (tracrRNA) molecule set forth in any one of SEQ ID NOs: 827-957, or a complement thereof. In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) includes a stretch of nucleotides with 80% or more sequence identity with an activator (tracrRNA) molecule set forth in any one of SEQ ID NOs: 827-957, or a complement thereof. In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) includes a stretch of nucleotides with 85% or more sequence identity with an activator (tracrRNA) molecule set forth in any one of SEQ ID NOs: 827-957, or a complement thereof. In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) includes a stretch of nucleotides with 90% or more sequence identity with an activator (tracrRNA) molecule set forth in any one of SEQ ID NOs: 827-957, or a complement thereof. In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) includes a stretch of nucleotides with 95% or more sequence identity with an activator (tracrRNA) molecule set forth in any one of SEQ ID NOs: 827-957, or a complement thereof. In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) includes a stretch of nucleotides with 98% or more sequence identity with an activator (tracrRNA) molecule set forth in any one of SEQ ID NOs: 827-957, or a complement thereof. In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) includes a stretch of nucleotides with 100% sequence identity with an activator (tracrRNA) molecule set forth in any one of SEQ ID NOs: 827-957, or a complement thereof.

In some embodiments, the duplex-forming segment of the activator (of a dual guide RNA or a single guide RNA) is 60% or more identical to one of the activator (tracrRNA) molecules set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides). For example, the duplex-forming segment of the activator (or the DNA encoding the duplex-forming segment of the activator) can be 65% or more identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the activator (or the DNA encoding the duplex-forming segment of the activator) (of a dual guide RNA or a single guide RNA) can be 70% or more identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the activator (or the DNA encoding the duplex-forming segment of the activator) (of a dual guide RNA or a single guide RNA) can be 75% or more identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the activator (or the DNA encoding the duplex-forming segment of the activator) (of a dual guide RNA or a single guide RNA) can be 80% or more identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the activator (or the DNA encoding the duplex-forming segment of the activator) (of a dual guide RNA or a single guide RNA) can be 85% or more identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the activator (or the DNA encoding the duplex-forming segment of the activator) (of a dual guide RNA or a single guide RNA) can be 90% or more identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the activator (or the DNA encoding the duplex-forming segment of the activator) (of a dual guide RNA or a single guide RNA) can be 95% or more identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the activator (or the DNA encoding the duplex-forming segment of the activator) (of a dual guide RNA or a single guide RNA) can be 98% or more identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the activator (or the DNA encoding the duplex-forming segment of the activator) (of a dual guide RNA or a single guide RNA) can be 99% or more identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the activator (or the DNA encoding the duplex-forming segment of the activator) (of a dual guide RNA or a single guide RNA) can be 100% identical to one of the tracrRNA sequences set forth in SEQ ID NOs: 827-957, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

In some embodiments, the duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) (of a dual guide RNA or a single guide RNA) is 60% or more identical to one of the targeter (crRNA) sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides). For example, the duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) (of a dual guide RNA or a single guide RNA) can be 65% or more identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) can be 70% or more identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) (of a dual guide RNA or a single guide RNA) can be 75% or more identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) (of a dual guide RNA or a single guide RNA) can be 80% or more identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) (of a dual guide RNA or a single guide RNA) can be 85% or more identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) (of a dual guide RNA or a single guide RNA) can be 90% or more identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) (of a dual guide RNA or a single guide RNA) can be 95% or more identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) (of a dual guide RNA or a single guide RNA) can be 98% or more identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) (of a dual guide RNA or a single guide RNA) can be 99% or more identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

The duplex-forming segment of the targeter (or the DNA encoding the duplex-forming segment of the targeter) (of a dual guide RNA or a single guide RNA) can be 100% identical to one of the crRNA sequences set forth in SEQ ID NOs: 962-1075, or a complement thereof, over a stretch of 8 or more contiguous nucleotides (e.g., 8 or more contiguous nucleotides, 10 or more contiguous nucleotides, 12 or more contiguous nucleotides, 15 or more contiguous nucleotides, or 20 or more contiguous nucleotides).

In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) includes 30 or more nucleotides (nt) (e.g., 40 or more, 50 or more, 60 or more, 70 or more, 75 or more nt) (e.g., of a wild type Cas9 guide RNA). In some cases, an activator (e.g., a trRNA, trRNA-like molecule, etc.) of a dual guide nucleic acid (e.g., a dual guide RNA) or a single guide nucleic acid (e.g., a single guide RNA) has a length in a range of from 30 to 200 nucleotides (nt) (e.g., 40 to 200 nucleotides, 50 to 200 nucleotides, 60 to 200 nucleotides, 65 to 200 nucleotides, 70 to 200 nucleotides, 75 to 200 nucleotides, 40 to 150 nucleotides, 50 to 150 nucleotides, 60 to 150 nucleotides, 65 to 150 nucleotides, 70 to 150 nucleotides, 75 to 150 nucleotides, 40 to 100 nucleotides, 50 to 100 nucleotides, 60 to 100 nucleotides, 65 to 100 nucleotides, 70 to 100 nucleotides, or 75 to 100 nucleotides).

With regard to both a single guide nucleic acid and a dual guide nucleic acid, the dsRNA duplex of the protein-binding segment can have a length from about 6 base pairs (bp) to about 50 bp. For example, the dsRNA duplex of the protein-binding segment can have a length from about 6 bp to about 40 bp, from about 6 bp to about 30 bp, from about 6 bp to about 25 bp, from about 6 bp to about 20 bp, from about 6 bp to about 15 bp, from about 8 bp to about 40 bp, from about 8 bp to about 30 bp, from about 8 bp to about 25 bp, from about 8 bp to about 20 bp or from about 8 bp to about 15 bp. For example, the dsRNA duplex of the protein-binding segment can have a length from about 8 bp to about 10 bp, from about 10 bp to about 15 bp, from about 15 bp to about 18 bp, from about 18 bp to about 20 bp, from about 20 bp to about 25 bp, from about 25 bp to about 30 bp, from about 30 bp to about 35 bp, from about 35 bp to about 40 bp, or from about 40 bp to about 50 bp. In some embodiments, the dsRNA duplex of the protein-binding segment has a length of 36 base pairs. The percent complementarity between the nucleotide sequences that hybridize to form the dsRNA duplex of the protein-binding segment can be 60% or more. For example, the percent complementarity between the nucleotide sequences that hybridize to form the dsRNA duplex of the protein-binding segment can be 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, or 99% or more. In some cases, the percent complementarity between the nucleotide sequences that hybridize to form the dsRNA duplex of the protein-binding segment is 100%.
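The percent complementarity referred to above can be illustrated as follows. This is a minimal sketch, assuming two equal-length strands, each written 5′ to 3′, that pair in antiparallel orientation with no bulges or loops. The function name, the optional G-U wobble flag, and the example strands are hypothetical illustrations and are not drawn from the disclosure.

```python
# Canonical Watson-Crick RNA base pairs.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
# G-U wobble pairs, counted only when explicitly requested (an assumption).
WOBBLE = {("G", "U"), ("U", "G")}


def percent_complementarity(strand: str, partner: str,
                            allow_gu_wobble: bool = False) -> float:
    """Percent of positions that base-pair when `strand` (5'->3') is aligned
    antiparallel to `partner` (5'->3'). Strands must be the same length."""
    s = strand.upper()
    p = partner.upper()[::-1]  # reverse so position i of s faces position i of p
    if len(s) != len(p) or not s:
        raise ValueError("strands must be non-empty and of equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_gu_wobble else WATSON_CRICK
    paired = sum((x, y) in allowed for x, y in zip(s, p))
    return 100.0 * paired / len(s)


# Hypothetical 8-bp duplex with one mismatch at the first position: 7 of 8 pair.
example = percent_complementarity("GCCGCCAG", "CUGGCGGA")
```

Whether G-U wobble pairs count toward complementarity is left as a switch because the text above does not specify it.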

Stability Control Sequence (e.g., Transcriptional Terminator Segment)

In some embodiments, a guide nucleic acid comprises a stability control sequence. A stability control sequence influences the stability of a nucleic acid (e.g., a guide nucleic acid, a targeter, an activator, etc.). One example of a suitable stability control sequence for use with an RNA is a transcriptional terminator segment (i.e., a transcription termination sequence). A transcriptional terminator segment of a subject guide nucleic acid can have a total length of from about 10 nucleotides to about 100 nucleotides, e.g., from about 10 nucleotides (nt) to about 20 nt, from about 20 nt to about 30 nt, from about 30 nt to about 40 nt, from about 40 nt to about 50 nt, from about 50 nt to about 60 nt, from about 60 nt to about 70 nt, from about 70 nt to about 80 nt, from about 80 nt to about 90 nt, or from about 90 nt to about 100 nt. For example, the transcriptional terminator segment can have a length of from about 15 nucleotides (nt) to about 80 nt, from about 15 nt to about 50 nt, from about 15 nt to about 40 nt, from about 15 nt to about 30 nt or from about 15 nt to about 25 nt.

In some cases, the transcription termination sequence is one that is functional in a eukaryotic cell. In some cases, the transcription termination sequence is one that is functional in a prokaryotic cell.

In some cases, the nucleotide sequence that can be included in a stability control sequence (e.g., transcriptional termination segment, or in any segment of the guide nucleic acid to provide for increased stability) includes:

5′-UAAUCCCACAGCCGCCAGUUCCGCUGGCGGCAUUUU-3′ (SEQ ID NO: 1088) (a Rho-independent trp termination site).

Additional Sequences

In some embodiments, a guide nucleic acid comprises an additional segment or segments (in some cases at the 5′ end, in some cases at the 3′ end, in some cases at either the 5′ or 3′ end, in some cases embedded within the sequence (i.e., not at the 5′ and/or 3′ end), in some cases at both the 5′ end and the 3′ end, in some cases embedded and at the 5′ end and/or the 3′ end, etc.). For example, a suitable additional segment can comprise a 5′ cap (e.g., a 7-methylguanylate cap (m⁷G)); a 3′ polyadenylated tail (i.e., a 3′ poly(A) tail); a ribozyme sequence (e.g., to allow for self-cleavage of a guide nucleic acid (or component of a guide nucleic acid, e.g., a targeter, an activator, etc.)); a riboswitch sequence (e.g., to allow for regulated stability and/or regulated accessibility by proteins and protein complexes); a sequence that forms a dsRNA duplex (i.e., a hairpin); a sequence that targets an RNA to a subcellular location (e.g., nucleus, mitochondria, chloroplasts, and the like); a modification or sequence that provides for tracking (e.g., a direct label (e.g., direct conjugation to a fluorescent molecule (i.e., fluorescent dye)), conjugation to a moiety that facilitates fluorescent detection, a sequence that allows for fluorescent detection, etc.); a modification or sequence that provides a binding site for proteins (e.g., proteins that act on DNA, including transcriptional activators, transcriptional repressors, DNA methyltransferases, DNA demethylases, histone acetyltransferases, histone deacetylases, proteins that bind RNA (e.g., RNA aptamers), labeled proteins, fluorescently labeled proteins, and the like); a modification or sequence that provides for increased, decreased, and/or controllable stability; and combinations thereof.

A dual guide nucleic acid can be designed to allow for controlled (i.e., conditional) binding of a targeter with an activator. Because a dual guide nucleic acid is not functional unless both the activator and the targeter are bound in a functional complex with Cas9, a dual guide nucleic acid can be inducible (e.g., drug inducible) by rendering the binding between the activator and the targeter to be inducible. As one non-limiting example, RNA aptamers can be used to regulate (i.e., control) the binding of the activator with the targeter. Accordingly, the activator and/or the targeter can include an RNA aptamer sequence.

Aptamers (e.g., RNA aptamers) are known in the art and are generally a synthetic version of a riboswitch. The terms “RNA aptamer” and “riboswitch” are used interchangeably herein to encompass both synthetic and natural nucleic acid sequences that provide for inducible regulation of the structure (and therefore the availability of specific sequences) of the nucleic acid molecule (e.g., RNA, DNA/RNA hybrid, etc.) of which they are part. RNA aptamers usually comprise a sequence that folds into a particular structure (e.g., a hairpin), which specifically binds a particular drug (e.g., a small molecule). Binding of the drug causes a structural change in the folding of the RNA, which changes a feature of the nucleic acid of which the aptamer is a part. As non-limiting examples: (i) an activator with an aptamer may not be able to bind to the cognate targeter unless the aptamer is bound by the appropriate drug; (ii) a targeter with an aptamer may not be able to bind to the cognate activator unless the aptamer is bound by the appropriate drug; and (iii) a targeter and an activator, each comprising a different aptamer that binds a different drug, may not be able to bind to each other unless both drugs are present. As illustrated by these examples, a dual guide nucleic acid can be designed to be inducible.
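The three non-limiting examples above amount to a logical AND over whichever drugs the aptamers require. The following is a deliberately idealized sketch of that logic, assuming that an aptamer fully blocks activator/targeter pairing until its drug is present. The function name and the drug labels are hypothetical, and real aptamer behavior is graded rather than binary.

```python
from typing import Optional, Set


def duplex_can_form(activator_drug: Optional[str],
                    targeter_drug: Optional[str],
                    drugs_present: Set[str]) -> bool:
    """Idealized model: the activator and targeter can pair only when every
    drug required by an aptamer on either molecule is present.
    A value of None means that molecule carries no aptamer."""
    required = {d for d in (activator_drug, targeter_drug) if d is not None}
    return required <= set(drugs_present)


# Case (iii): each molecule carries a different aptamer, so both drugs are needed.
both_present = duplex_can_form("drug_A", "drug_B", {"drug_A", "drug_B"})
only_one = duplex_can_form("drug_A", "drug_B", {"drug_A"})
```

In this model a guide with no aptamers is always able to pair, which corresponds to a non-inducible dual guide nucleic acid.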

Examples of aptamers and riboswitches can be found, for example, in: Nakamura et al., Genes Cells. 2012 May; 17(5):344-64; Vavalle et al., Future Cardiol. 2012 May; 8(3):371-82; Citartan et al., Biosens Bioelectron. 2012 Apr. 15; 34(1):1-11; and Liberman et al., Wiley Interdiscip Rev RNA. 2012 May-June; 3(3):369-84; all of which are herein incorporated by reference in their entirety.

Examples of various guide RNAs (and Cas9 proteins) can be found in the art, for example, see Jinek et al., Science. 2012 Aug. 17; 337(6096):816-21; Chylinski et al., RNA Biol. 2013 May; 10(5):726-37; Ma et al., Biomed Res Int. 2013; 2013:270805; Hou et al., Proc Natl Acad Sci USA. 2013 Sep. 24; 110(39):15644-9; Jinek et al., Elife. 2013; 2:e00471; Pattanayak et al., Nat Biotechnol. 2013 September; 31(9):839-43; Qi et al., Cell. 2013 Feb. 28; 152(5):1173-83; Wang et al., Cell. 2013 May 9; 153(4):910-8; Auer et al., Genome Res. 2013 Oct. 31; Chen et al., Nucleic Acids Res. 2013 Nov. 1; 41(20):e19; Cheng et al., Cell Res. 2013 October; 23(10):1163-71; Cho et al., Genetics. 2013 November; 195(3):1177-80; DiCarlo et al., Nucleic Acids Res. 2013 April; 41(7):4336-43; Dickinson et al., Nat Methods. 2013 October; 10(10):1028-34; Ebina et al., Sci Rep. 2013; 3:2510; Fujii et al., Nucleic Acids Res. 2013 Nov. 1; 41(20):e187; Hu et al., Cell Res. 2013 November; 23(11):1322-5; Jiang et al., Nucleic Acids Res. 2013 Nov. 1; 41(20):e188; Larson et al., Nat Protoc. 2013 November; 8(11):2180-96; Mali et al., Nat Methods. 2013 October; 10(10):957-63; Nakayama et al., Genesis. 2013 December; 51(12):835-43; Ran et al., Nat Protoc. 2013 November; 8(11):2281-308; Ran et al., Cell. 2013 Sep. 12; 154(6):1380-9; Upadhyay et al., G3 (Bethesda). 2013 Dec. 9; 3(12):2233-8; Walsh et al., Proc Natl Acad Sci USA. 2013 Sep. 24; 110(39):15514-5; Xie et al., Mol Plant. 2013 Oct. 9; Yang et al., Cell. 2013 Sep. 12; 154(6):1370-9; Briner et al., Mol Cell. 2014 Oct. 23; 56(2):333-9; and U.S. patents and patent applications: U.S. Pat. Nos. 8,906,616; 8,895,308; 8,889,418; 8,889,356; 8,871,445; 8,865,406; 8,795,965; 8,771,945; 8,697,359; 20140068797; 20140170753; 20140179006; 20140179770; 20140186843; 20140186919; 20140186958; 20140189896; 20140227787; 20140234972; 20140242664; 20140242699; 20140242700; 20140242702; 20140248702; 20140256046; 20140273037; 20140273226; 20140273230; 20140273231; 20140273232; 20140273233; 20140273234; 20140273235; 20140287938; 20140295556; 20140295557; 20140298547; 20140304853; 20140309487; 20140310828; 20140310830; 20140315985; 20140335063; 20140335620; 20140342456; 20140342457; 20140342458; 20140349400; 20140349405; 20140356867; 20140356956; 20140356958; 20140356959; 20140357523; 20140357530; 20140364333; and 20140377868; all of which are hereby incorporated by reference in their entirety.

Cas9 Polypeptides

As noted above, in some cases, a system of the present disclosure comprises: a) a Cas9 guide RNA, or one or more nucleic acids comprising nucleotide sequences encoding the Cas9 guide RNA; b) an asymmetric double-stranded or single-stranded donor DNA (also referred to as a “donor DNA template” or a “donor DNA molecule”); and c) a Cas9 polypeptide or a nucleic acid comprising a nucleotide sequence encoding the Cas9 polypeptide. The guide nucleic acid provides target specificity to the complex by comprising a nucleotide sequence that is complementary to a sequence (the target site) of a target nucleic acid (as noted above). The Cas9 polypeptide of the complex provides the site-specific activity. In other words, the Cas9 polypeptide is guided to a target site within a target nucleic acid sequence (e.g., a chromosomal sequence or an extrachromosomal sequence, e.g., an episomal sequence, a minicircle sequence, a mitochondrial sequence, a chloroplast sequence, etc.) by virtue of its association with the protein-binding segment of the guide nucleic acid (described above).

A suitable Cas9 polypeptide can bind and/or modify (e.g., cleave, methylate, demethylate, etc.) a target nucleic acid and/or a polypeptide associated with the target nucleic acid (e.g., methylation or acetylation of a histone tail). A Cas9 polypeptide is also referred to herein as a “site-directed polypeptide.”

In some cases, the Cas9 polypeptide is a naturally-occurring polypeptide (e.g., naturally occurs in bacterial and/or archaeal cells). In other cases, the Cas9 polypeptide is not a naturally-occurring polypeptide (e.g., the Cas9 polypeptide is a variant Cas9 polypeptide, a chimeric polypeptide as discussed below, and the like).

Exemplary Cas9 polypeptides are set forth in SEQ ID NOs: 5-816 as a non-limiting and non-exhaustive list of Cas9 proteins. Naturally occurring Cas9 polypeptides bind a guide nucleic acid (e.g., a Cas9 guide RNA), are thereby directed to a specific sequence within a target nucleic acid (a target site), and cleave the target nucleic acid (e.g., cleave dsDNA to generate a double strand break). A Cas9 polypeptide comprises two portions, an RNA-binding portion and an activity portion. An RNA-binding portion interacts with a guide nucleic acid. An activity portion exhibits site-directed activity (e.g., enzymatic activity such as nuclease activity). In some cases, e.g., when the Cas9 protein is a chimeric Cas9 polypeptide, the activity portion can exhibit a site-directed activity (e.g., enzymatic activity) such as DNA and/or RNA methylation activity, DNA and/or RNA cleavage activity, histone acetylation activity, histone methylation activity, etc. In some cases, the activity portion exhibits reduced nuclease activity relative to the corresponding portion of a wild type Cas9 polypeptide (e.g., the Cas9 protein can include one or more mutations in a naturally occurring catalytic site such as in the RuvC and/or HNH domains).

Assays to determine whether a protein has an RNA-binding portion that interacts with a subject guide nucleic acid can be any convenient binding assay that tests for binding between a protein and a nucleic acid. Suitable binding assays (e.g., gel shift assays) include those that involve adding a guide nucleic acid and a Cas9 polypeptide to a target nucleic acid. Assays to determine whether a protein has an activity portion (e.g., to determine if the polypeptide has nuclease activity that cleaves a target nucleic acid) can be any convenient nucleic acid cleavage assay that tests for nucleic acid cleavage. Suitable cleavage assays include those that involve adding a guide nucleic acid and a Cas9 polypeptide to a target nucleic acid.

Many Cas9 orthologs from a wide variety of species have been identified and the proteins share only a few identical amino acids. All identified Cas9 orthologs have the same domain architecture with a central HNH endonuclease domain and a split RuvC/RNaseH domain (e.g., see Table 1). For example, a Cas9 protein can have 3 different regions (sometimes referred to as RuvC-I, RuvC-II, and RuvC-III) that are not contiguous with respect to the primary amino acid sequence of the Cas9 protein, but fold together to form a RuvC domain once the protein is produced and folds. Thus, Cas9 proteins can be said to share at least 4 key motifs with a conserved architecture. Motifs 1, 2, and 4 are RuvC-like motifs while motif 3 is an HNH-motif. The motifs set forth in Table 1 may not represent the entire RuvC-like and/or HNH domains as accepted in the art, but Table 1 presents motifs that can be used to help determine whether a given protein is a Cas9 protein.

TABLE 1

Table 1 lists 4 motifs that are present in Cas9 sequences from various species. The amino acids listed here are from the Cas9 from S. pyogenes (SEQ ID NO: 5).

Motif #   Motif          Amino acids (residue #s)                                Highly conserved
1         RuvC-like I    IGLDIGTNSVGWAVI (7-21) (SEQ ID NO: 1)                   D10, G12, G17
2         RuvC-like II   IVIEMARE (759-766) (SEQ ID NO: 2)                       E762
3         HNH-motif      DVDHIVPQSFLKDDSIDNKVLTRSDKN (837-863) (SEQ ID NO: 3)    H840, N854, N863
4         RuvC-like II   HHAHDAYL (982-989) (SEQ ID NO: 4)                       H982, H983, A984, D986, A987

In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to motifs 1-4 as set forth in SEQ ID NOs: 1-4, respectively (e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 5-816.

In other words, in some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
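As an illustration only, the per-motif identity thresholds described above can be sketched in code. The sketch assumes that each candidate motif segment has already been aligned to, and has the same length as, the corresponding reference motif of Table 1, and it computes a simple ungapped percent identity; the disclosure does not prescribe this particular calculation, and the function names `percent_identity` and `motifs_meet_threshold` are hypothetical.

```python
# Reference motifs copied from Table 1 (S. pyogenes Cas9, SEQ ID NOs: 1-4).
REFERENCE_MOTIFS = {
    1: "IGLDIGTNSVGWAVI",              # RuvC-like motif, residues 7-21
    2: "IVIEMARE",                     # RuvC-like motif, residues 759-766
    3: "DVDHIVPQSFLKDDSIDNKVLTRSDKN",  # HNH-motif, residues 837-863
    4: "HHAHDAYL",                     # RuvC-like motif, residues 982-989
}


def percent_identity(candidate: str, reference: str) -> float:
    """Ungapped percent identity between two pre-aligned, equal-length segments.

    Assumption: this simple position-by-position comparison stands in for
    whatever alignment method is actually used.
    """
    if len(candidate) != len(reference):
        raise ValueError("segments must be pre-aligned to equal length")
    matches = sum(c == r for c, r in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def motifs_meet_threshold(candidate_motifs: dict, threshold: float = 60.0) -> bool:
    """Return True if each of motifs 1-4 has `threshold` percent identity or more."""
    return all(
        percent_identity(candidate_motifs[number], reference) >= threshold
        for number, reference in REFERENCE_MOTIFS.items()
    )


# Hypothetical candidate: identical to the reference except one residue in motif 2.
candidate = dict(REFERENCE_MOTIFS)
candidate[2] = "IVIEMAKE"

print(percent_identity(candidate[2], REFERENCE_MOTIFS[2]))  # 87.5 (7 of 8 residues match)
print(motifs_meet_threshold(candidate, threshold=85.0))     # True
print(motifs_meet_threshold(candidate, threshold=90.0))     # False
```

In this hypothetical case the candidate satisfies the "85% or more" criterion for all four motifs but not the "90% or more" criterion, because motif 2 is only 87.5% identical.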

In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 70% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 75% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 80% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 85% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 90% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 95% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 99% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 100% amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. Any Cas9 protein as defined above can be used as a Cas9 polypeptide or as part of a chimeric Cas9 polypeptide of the subject methods.

In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.

In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 60% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 70% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 75% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 80% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 85% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 90% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 95% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 99% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 100% amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. Any Cas9 protein as defined above can be used as a Cas9 polypeptide or as part of a chimeric Cas9 polypeptide of the subject methods.

In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.

In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 60% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 70% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 75% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 80% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 85% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 90% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 95% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 99% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
Any Cas9 protein as defined above can be used as a Cas9 polypeptide or as part of a chimeric Cas9 polypeptide of the subject methods. In some cases, a suitable Cas9 polypeptide comprises an amino acid sequence having 100% amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.

As used herein, the term “Cas9 polypeptide” encompasses the term “variant Cas9 polypeptide”; and the term “variant Cas9 polypeptide” encompasses the term “chimeric Cas9 polypeptide.”

Variant Cas9 Polypeptides

In some cases, a suitable Cas9 polypeptide is a variant Cas9 polypeptide. A variant Cas9 polypeptide has an amino acid sequence that differs by at least one amino acid (e.g., has a deletion, insertion, substitution, fusion) when compared to the amino acid sequence of a wild type Cas9 polypeptide. In some instances, the variant Cas9 polypeptide has an amino acid change (e.g., deletion, insertion, or substitution) that reduces the nuclease activity of the Cas9 polypeptide. For example, in some instances, the variant Cas9 polypeptide has less than 50%, less than 40%, less than 30%, less than 20%, less than 10%, less than 5%, or less than 1% of the nuclease activity of the corresponding wild-type Cas9 polypeptide. In some cases, the variant Cas9 polypeptide has no substantial nuclease activity. When a subject Cas9 polypeptide is a variant Cas9 polypeptide that has no substantial nuclease activity, it can be referred to as “dead Cas9” or “dCas9.”

In some cases, a variant Cas9 polypeptide has reduced nuclease activity (e.g., a variant Cas9 protein can include a mutation in one or more catalytic domains). In some cases, a variant Cas9 polypeptide can cleave the complementary strand of a target nucleic acid but has reduced ability to cleave the non-complementary strand of a double stranded target nucleic acid. For example, the variant Cas9 polypeptide can have a mutation (amino acid substitution) that reduces the function of the RuvC domain (a catalytic domain). As a non-limiting example, in some embodiments, a variant Cas9 polypeptide has a mutation at position D10 (e.g., D10A; aspartate to alanine at amino acid position 10) of SEQ ID NO: 5 (or the corresponding position of any of the proteins presented in SEQ ID NOs: 6-816) and can therefore cleave the complementary strand of a double stranded target nucleic acid but has reduced ability to cleave the non-complementary strand of a double stranded target nucleic acid (thus resulting in a single strand break (SSB) instead of a double strand break (DSB) when the variant Cas9 polypeptide cleaves a double stranded target nucleic acid) (see, for example, Jinek et al., Science. 2012 Aug. 17; 337(6096):816-21).
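The substitution notation used above (e.g., D10A) can be illustrated with a short sketch. Because the full sequence of SEQ ID NO: 5 is not reproduced in this section, the sketch operates on motif 1 of Table 1 (residues 7-21 of SEQ ID NO: 5) and uses an offset to map full-length positions onto that fragment. The helper name `apply_substitution` is hypothetical, and the code only illustrates the notation; it is not a method of the disclosure.

```python
import re

# Motif 1 of Table 1 (SEQ ID NO: 1), which spans residues 7-21 of SEQ ID NO: 5.
MOTIF_1 = "IGLDIGTNSVGWAVI"
MOTIF_1_START = 7


def apply_substitution(segment: str, mutation: str, start: int = 1) -> str:
    """Apply a substitution such as 'D10A' to a sequence segment.

    `start` is the 1-based position, in the full-length protein, of the first
    residue of `segment`. The wild-type residue named in the mutation is
    checked against the segment before the replacement is made.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type = match.group(1)
    position = int(match.group(2))
    replacement = match.group(3)
    index = position - start
    if not 0 <= index < len(segment):
        raise ValueError(f"position {position} lies outside the segment")
    if segment[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {segment[index]}"
        )
    return segment[:index] + replacement + segment[index + 1:]


# D10A: aspartate (D) at position 10 replaced by alanine (A).
print(apply_substitution(MOTIF_1, "D10A", start=MOTIF_1_START))  # IGLAIGTNSVGWAVI
```

Checking the wild-type residue before substituting guards against applying a position from SEQ ID NO: 5 to a different Cas9 sequence, where the corresponding position may have a different number.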

In some cases, a variant Cas9 polypeptide can cleave the non-complementary strand of a double stranded target nucleic acid but has reduced ability to cleave the complementary strand of the target nucleic acid. For example, the variant Cas9 polypeptide can have a mutation (amino acid substitution) that reduces the function of the HNH domain (a catalytic domain). As a non-limiting example, in some embodiments, a variant Cas9 polypeptide has a mutation at position H840 (e.g., H840A; histidine to alanine at amino acid position 840) of SEQ ID NO: 5 (or the corresponding position of any of the proteins presented in SEQ ID NOs: 6-816) and can therefore cleave the non-complementary strand of the target nucleic acid but has reduced ability to cleave the complementary strand of the target nucleic acid (thus resulting in a SSB instead of a DSB when the variant Cas9 polypeptide cleaves a double stranded target nucleic acid). Such a Cas9 polypeptide has a reduced ability to cleave a target nucleic acid (e.g., a single stranded target nucleic acid) but retains the ability to bind a target nucleic acid (e.g., a single stranded target nucleic acid).

In some cases, a variant Cas9 polypeptide has a reduced ability to cleave both the complementary and the non-complementary strands of a double stranded target nucleic acid. As a non-limiting example, in some cases, the variant Cas9 polypeptide harbors mutations in both the RuvC and HNH domains (e.g., mutations at both the D10 and H840 positions, e.g., both the D10A and the H840A mutations, or the corresponding mutations of any of the proteins set forth as SEQ ID NOs: 5-816) such that the polypeptide has a reduced ability to cleave both the complementary and the non-complementary strands of a double stranded target nucleic acid. Such a Cas9 polypeptide can have a reduced ability to cleave a target nucleic acid but retain the ability to bind a target nucleic acid.
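The relationship described above between the mutated catalytic domain and the expected break type can be summarized in a small sketch. It is a deliberate simplification: it considers only the two example positions named in the text (D10 in the RuvC domain and H840 in the HNH domain of SEQ ID NO: 5), and it treats "reduced ability to cleave" as if the strand were not cleaved at all. The names `DOMAIN_OF_POSITION`, `STRAND_CLEAVED_BY_DOMAIN`, and `expected_break` are hypothetical.

```python
# Example positions from the text, numbered as in SEQ ID NO: 5.
DOMAIN_OF_POSITION = {10: "RuvC", 840: "HNH"}

# Per the text: impairing RuvC (e.g., D10A) reduces cleavage of the
# non-complementary strand; impairing HNH (e.g., H840A) reduces cleavage of
# the complementary strand.
STRAND_CLEAVED_BY_DOMAIN = {"HNH": "complementary", "RuvC": "non-complementary"}


def expected_break(mutated_positions):
    """Return (break type, strands still cleaved) for the mutated positions.

    Simplifying assumption: a mutation at a listed position is treated as
    fully inactivating its domain; positions not listed are ignored.
    """
    impaired = {
        DOMAIN_OF_POSITION[p] for p in mutated_positions if p in DOMAIN_OF_POSITION
    }
    cleaved = sorted(
        strand
        for domain, strand in STRAND_CLEAVED_BY_DOMAIN.items()
        if domain not in impaired
    )
    if len(cleaved) == 2:
        return "DSB", cleaved
    if len(cleaved) == 1:
        return "SSB", cleaved
    return "binding without substantial cleavage", cleaved


print(expected_break([]))         # ('DSB', ['complementary', 'non-complementary'])
print(expected_break([10]))       # ('SSB', ['complementary'])
print(expected_break([840]))      # ('SSB', ['non-complementary'])
print(expected_break([10, 840]))  # ('binding without substantial cleavage', [])
```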

Other residues can be mutated to achieve the above effects (i.e., inactivate one or the other nuclease domains). As non-limiting examples, residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and/or A987 of the Cas9 protein set forth in SEQ ID NO: 5 (or the corresponding residues of any of the proteins set forth as SEQ ID NOs: 6-816) can be altered (i.e., substituted) (e.g., see Table 1 for more information regarding the conservation of Cas9 amino acid residues, e.g., those that are highly conserved and are included in the motifs listed in Table 1). Also, mutations other than alanine substitutions are suitable.

In some embodiments, when a variant Cas9 polypeptide has reduced catalytic activity (e.g., when a Cas9 protein has a D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and/or an A987 mutation, e.g., D10A, G12A, G17A, E762A, H840A, N854A, N863A, H982A, H983A, A984A, and/or D986A), the variant Cas9 polypeptide can still bind to target nucleic acid in a site-specific manner (because it is still guided to a target nucleic acid sequence by a guide nucleic acid) as long as it retains the ability to interact with the guide nucleic acid.

In addition to the above, a variant Cas9 protein can have the same parameters for sequence identity as described above for Cas9 polypeptides. Thus, in some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to motifs 1-4 of a Cas9 protein, e.g., as set forth in SEQ ID NOs: 1-4, respectively, as depicted in Table 1, or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 5-816.

For example, in some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.

In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 70% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 75% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 80% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 85% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 90% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 95% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 99% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 100% amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816. Any Cas9 protein as defined above can be used as a variant Cas9 polypeptide or as part of a chimeric variant Cas9 polypeptide of the subject methods.

In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more, or 100% amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.

In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 60% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. Any Cas9 protein as defined above can be used as a variant Cas9 polypeptide or as part of a chimeric variant Cas9 polypeptide of the subject methods. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 70% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 75% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 80% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 85% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 90% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 95% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 99% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 100% amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. Any Cas9 protein as defined above can be used as a variant Cas9 polypeptide or as part of a chimeric variant Cas9 polypeptide of the subject methods.

In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more, or 100% amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.

In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 60% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 70% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 75% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 80% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 85% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 90% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 95% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
In some cases, a suitable variant Cas9 polypeptide comprises an amino acid sequence having 99% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. Any Cas9 protein as defined above can be used as a variant Cas9 polypeptide or as part of a chimeric variant Cas9 polypeptide of the subject methods. In some cases, a suitable variant Cas9 polypeptide (e.g., a chimeric Cas9 protein, i.e., a Cas9 fusion protein) comprises an amino acid sequence having 100% amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.

Chimeric Polypeptides (Fusion Polypeptides)

In some embodiments, a variant Cas9 polypeptide is a chimeric Cas9 polypeptide (also referred to herein as a fusion polypeptide, e.g., a “Cas9 fusion polypeptide”). A Cas9 fusion polypeptide can bind and/or modify a target nucleic acid (e.g., cleave, methylate, demethylate, etc.) and/or a polypeptide associated with target nucleic acid (e.g., methylation, acetylation, etc., of, for example, a histone tail).

A Cas9 fusion polypeptide is a variant Cas9 polypeptide by virtue of differing in sequence from a wild type Cas9 polypeptide. A Cas9 fusion polypeptide is a Cas9 polypeptide (e.g., a wild type Cas9 polypeptide, a variant Cas9 polypeptide, a variant Cas9 polypeptide with reduced nuclease activity (as described above), and the like) fused to a covalently linked heterologous polypeptide (also referred to as a “fusion partner”). In some cases, a Cas9 fusion polypeptide is a variant Cas9 polypeptide with reduced nuclease activity (e.g., a nickase Cas9 protein (i.e., a Cas9 protein without catalytic function of one of the RuvC or HNH domains), a dCas9, etc.) fused to a covalently linked heterologous polypeptide. In some cases, the heterologous polypeptide exhibits (and therefore provides for) an activity (e.g., an enzymatic activity) that will also be exhibited by the Cas9 fusion polypeptide (e.g., methyltransferase activity, acetyltransferase activity, kinase activity, ubiquitinating activity, etc.). In some such cases, e.g., in a method of binding where the Cas9 polypeptide is a variant Cas9 polypeptide having a fusion partner (i.e., having a heterologous polypeptide) with an activity (e.g., an enzymatic activity) that modifies the target nucleic acid, the method can also be considered to be a method of modifying the target nucleic acid. In some cases, a method of binding a target nucleic acid (e.g., a single stranded target nucleic acid) can result in modification of the target nucleic acid. Thus, in some cases, a method of binding a target nucleic acid (e.g., a single stranded target nucleic acid) can be a method of modifying the target nucleic acid.

In some cases, the heterologous sequence provides for subcellular localization, i.e., the heterologous sequence is a subcellular localization sequence (e.g., one or more nuclear localization signals (NLSs) for targeting to the nucleus, two or more NLSs, three or more NLSs, a sequence to keep the fusion protein out of the nucleus, e.g., a nuclear export sequence (NES), a sequence to keep the fusion protein retained in the cytoplasm, a mitochondrial localization signal for targeting to the mitochondria, a chloroplast localization signal for targeting to a chloroplast, an ER retention signal, and the like). In some embodiments, a variant Cas9 does not include an NLS so that the protein is not targeted to the nucleus. In some embodiments, the heterologous sequence can provide a tag (i.e., the heterologous sequence is a detectable label) for ease of tracking and/or purification (e.g., a fluorescent protein, e.g., green fluorescent protein (GFP), YFP, RFP, CFP, mCherry, tdTomato, and the like; a histidine tag, e.g., a 6× His tag; a hemagglutinin (HA) tag; a FLAG tag; a Myc tag; and the like). In some embodiments, the heterologous sequence can provide for increased or decreased stability (i.e., the heterologous sequence is a stability control peptide, e.g., a degron, which in some cases is controllable (e.g., a temperature sensitive or drug controllable degron sequence, see below)). In some embodiments, the heterologous sequence can provide for increased or decreased transcription from the target nucleic acid (i.e., the heterologous sequence is a transcription modulation sequence, e.g., a transcription factor/activator or a fragment thereof, a protein or fragment thereof that recruits a transcription factor/activator, a transcription repressor or a fragment thereof, a protein or fragment thereof that recruits a transcription repressor, a small molecule/drug-responsive transcription regulator, etc.).
In some embodiments, the heterologous sequence can provide a binding domain (i.e., the heterologous sequence is a protein binding sequence, e.g., to provide the ability of a Cas9 fusion polypeptide to bind to another protein of interest, e.g., a DNA or histone modifying protein, a transcription factor or transcription repressor, a recruiting protein, an RNA modification enzyme, an RNA-binding protein, a translation initiation factor, an RNA splicing factor, etc.). A heterologous nucleic acid sequence may be linked to another nucleic acid sequence (e.g., by genetic engineering) to generate a chimeric nucleotide sequence encoding a chimeric polypeptide.

In some cases, a Cas9 fusion polypeptide may be fused to a polypeptide permeant domain to promote uptake by the cell. A number of permeant domains are known in the art and may be used in the non-integrating polypeptides of the present disclosure, including peptides, peptidomimetics, and non-peptide carriers. For example, a permeant peptide may be derived from the third alpha helix of Drosophila melanogaster transcription factor Antennapedia, referred to as penetratin, which comprises the amino acid sequence RQIKIWFQNRRMKWKK (SEQ ID NO: 826). As another example, the permeant peptide comprises the HIV-1 tat basic region amino acid sequence, which may include, for example, amino acids 49-57 of naturally-occurring tat protein. Other permeant domains include poly-arginine motifs, for example, the region of amino acids 34-56 of HIV-1 rev protein, nona-arginine, octa-arginine, and the like. (See, for example, Futaki et al. (2003) Curr Protein Pept Sci. 2003 April; 4(2): 87-9 and 446; and Wender et al. (2000) Proc. Natl. Acad. Sci. U.S.A 2000 Nov. 21; 97(24):13003-8; published U.S. Patent applications 20030220334; 20030083256; 20030032593; and 20030022831, herein specifically incorporated by reference for the teachings of translocation peptides and peptoids). The nona-arginine (R9) sequence is one of the more efficient PTDs that have been characterized (Wender et al. 2000; Uemura et al. 2002). The site at which the fusion is made may be selected in order to optimize the biological activity, secretion or binding characteristics of the polypeptide. The optimal site will be determined by routine experimentation.

In some cases, a Cas9 fusion polypeptide includes a “Protein Transduction Domain” or PTD (also known as a CPP, or cell penetrating peptide), which refers to a polypeptide, polynucleotide, carbohydrate, or organic or inorganic compound that facilitates traversing a lipid bilayer, micelle, cell membrane, organelle membrane, or vesicle membrane. A PTD attached to another molecule, which can range from a small polar molecule to a large macromolecule and/or a nanoparticle, facilitates the molecule traversing a membrane, for example going from extracellular space to intracellular space, or cytosol to within an organelle. In some embodiments, a PTD is covalently linked to the amino terminus of a polypeptide (e.g., a Cas9 fusion polypeptide). In some embodiments, a PTD is covalently linked to the carboxyl terminus of a polypeptide (e.g., a Cas9 fusion polypeptide). In some cases, the PTD is inserted internally in the Cas9 fusion polypeptide (i.e., is not at the N- or C-terminus of the Cas9 fusion polypeptide) at a suitable insertion site, as described herein. In some cases, a subject Cas9 fusion polypeptide includes (is conjugated to, is fused to) one or more PTDs (e.g., two or more, three or more, four or more PTDs). In some cases a PTD includes a nuclear localization signal (NLS) (e.g., in some cases 2 or more, 3 or more, 4 or more, or 5 or more NLSs). Thus, in some cases, a Cas9 fusion polypeptide includes one or more NLSs (e.g., 2 or more, 3 or more, 4 or more, or 5 or more NLSs). In some embodiments, a PTD is covalently linked to a nucleic acid (e.g., a Cas9 guide nucleic acid, a polynucleotide encoding a Cas9 guide nucleic acid, a polynucleotide encoding a Cas9 fusion polypeptide, a donor polynucleotide, etc.). Examples of PTDs include but are not limited to a minimal undecapeptide protein transduction domain (corresponding to residues 47-57 of HIV-1 TAT comprising YGRKKRRQRRR; SEQ ID NO:1076); a polyarginine sequence comprising a number of arginines sufficient to direct entry into a cell (e.g., 3, 4, 5, 6, 7, 8, 9, 10, or 10-50 arginines); a VP22 domain (Zender et al. (2002) Cancer Gene Ther. 9(6):489-96); a Drosophila Antennapedia protein transduction domain (Noguchi et al. (2003) Diabetes 52(7):1732-1737); a truncated human calcitonin peptide (Trehin et al. (2004) Pharm. Research 21:1248-1256); polylysine (Wender et al. (2000) Proc. Natl. Acad. Sci. USA 97:13003-13008); RRQRRTSKLMKR (SEQ ID NO:1077); Transportan GWTLNSAGYLLGKINLKALAALAKKIL (SEQ ID NO:1078); KALAWEAKLAKALAKALAKHLAKALAKALKCEA (SEQ ID NO:1079); and RQIKIWFQNRRMKWKK (SEQ ID NO:1080). Exemplary PTDs include, but are not limited to, YGRKKRRQRRR (SEQ ID NO:1081); RKKRRQRRR (SEQ ID NO:1082); and an arginine homopolymer of from 3 arginine residues to 50 arginine residues. Exemplary PTD domain amino acid sequences include, but are not limited to, any of the following: YGRKKRRQRRR (SEQ ID NO:1083); RKKRRQRR (SEQ ID NO:1084); YARAAARQARA (SEQ ID NO:1085); THRLPRRRRRR (SEQ ID NO:1086); and GGRRARRRRRR (SEQ ID NO:1087). In some embodiments, the PTD is an activatable CPP (ACPP) (Aguilera et al. (2009) Integr Biol (Camb) June; 1(5-6): 371-381). ACPPs comprise a polycationic CPP (e.g., Arg9 or “R9”) connected via a cleavable linker to a matching polyanion (e.g., Glu9 or “E9”), which reduces the net charge to nearly zero and thereby inhibits adhesion and uptake into cells. Upon cleavage of the linker, the polyanion is released, locally unmasking the polyarginine and its inherent adhesiveness, thus “activating” the ACPP to traverse the membrane.
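The charge-masking principle of an ACPP can be illustrated with a short calculation. The following Python sketch is illustrative only and rests on stated assumptions: it uses a simplified charge model (arginine and lysine counted as +1, aspartate and glutamate as -1, with histidine, termini, and pH ignored), and the linker shown is a hypothetical placeholder rather than an actual cleavable sequence from this disclosure.

```python
# Simplified charge model (an assumption for illustration): Arg/Lys count as +1,
# Asp/Glu as -1; histidine, peptide termini, and pH effects are ignored.
POSITIVE = frozenset("RK")
NEGATIVE = frozenset("DE")


def net_charge(peptide: str) -> int:
    """Return the approximate net charge of a peptide given in one-letter code."""
    return sum((aa in POSITIVE) - (aa in NEGATIVE) for aa in peptide.upper())


polycation = "R" * 9          # Arg9 ("R9"), as named in the text
polyanion = "E" * 9           # Glu9 ("E9"), as named in the text
placeholder_linker = "GSGSG"  # hypothetical stand-in, NOT a real cleavable site

# Intact ACPP: polyanion and polycation are joined, so the charges cancel.
masked = polyanion + placeholder_linker + polycation
print(net_charge(masked))

# After linker cleavage the polyanion is released and only the polycation remains.
print(net_charge(polycation))
```

Under this simplified model the intact construct is approximately neutral, whereas the released polyarginine carries the full positive charge that is associated with membrane adhesion.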

A subject Cas9 fusion polypeptide (Cas9 fusion protein) can have multiple (1 or more, 2 or more, 3 or more, etc.) fusion partners in any combination of the above. As an illustrative example, a Cas9 fusion protein can have a heterologous sequence that provides an activity (e.g., for transcription modulation, target modification, modification of a protein associated with a target nucleic acid, etc.) and can also have a subcellular localization sequence. In some cases, such a Cas9 fusion protein might also have a tag for ease of tracking and/or purification (e.g., green fluorescent protein (GFP), YFP, RFP, CFP, mCherry, tdTomato, and the like; a histidine tag, e.g., a 6× His tag; a hemagglutinin (HA) tag; a FLAG tag; a Myc tag; and the like). As another illustrative example, a Cas9 protein can have one or more NLSs (e.g., two or more, three or more, four or more, five or more, 1, 2, 3, 4, or 5 NLSs). In some cases a fusion partner (or multiple fusion partners) (e.g., an NLS, a tag, a fusion partner providing an activity, etc.) is located at or near the C-terminus of Cas9. In some cases a fusion partner (or multiple fusion partners) (e.g., an NLS, a tag, a fusion partner providing an activity, etc.) is located at the N-terminus of Cas9. In some cases a Cas9 has a fusion partner (or multiple fusion partners) (e.g., an NLS, a tag, a fusion partner providing an activity, etc.) at both the N-terminus and C-terminus.

Suitable fusion partners that provide for increased or decreased stability include, but are not limited to, degron sequences. Degrons are readily understood by one of ordinary skill in the art to be amino acid sequences that control the stability of the protein of which they are part. For example, the stability of a protein comprising a degron sequence is controlled in part by the degron sequence. In some cases, a suitable degron is constitutive such that the degron exerts its influence on protein stability independent of experimental control (i.e., the degron is not drug inducible, temperature inducible, etc.). In some cases, the degron provides the variant Cas9 polypeptide with controllable stability such that the variant Cas9 polypeptide can be turned “on” (i.e., stable) or “off” (i.e., unstable, degraded) depending on the desired conditions. For example, if the degron is a temperature sensitive degron, the variant Cas9 polypeptide may be functional (i.e., “on”, stable) below a threshold temperature (e.g., 42° C., 41° C., 40° C., 39° C., 38° C., 37° C., 36° C., 35° C., 34° C., 33° C., 32° C., 31° C., 30° C., etc.) but non-functional (i.e., “off”, degraded) above the threshold temperature. As another example, if the degron is a drug inducible degron, the presence or absence of drug can switch the protein from an “off” (i.e., unstable) state to an “on” (i.e., stable) state or vice versa. An exemplary drug inducible degron is derived from the FKBP12 protein. The stability of the degron is controlled by the presence or absence of a small molecule that binds to the degron.

Examples of suitable degrons include, but are not limited to, those degrons controlled by Shield-1, DHFR, auxins, and/or temperature. Non-limiting examples of suitable degrons are known in the art (e.g., Dohmen et al., Science, 1994. 263(5151): p. 1273-1276: Heat-inducible degron: a method for constructing temperature-sensitive mutants; Schoeber et al., Am J Physiol Renal Physiol. 2009 January; 296(1):F204-11: Conditional fast expression and function of multimeric TRPV5 channels using Shield-1; Chu et al., Bioorg Med Chem Lett. 2008 Nov. 15; 18(22):5941-4: Recent progress with FKBP-derived destabilizing domains; Kanemaki, Pflugers Arch. 2012 Dec. 28: Frontiers of protein expression control with conditional degrons; Yang et al., Mol Cell. 2012 Nov. 30; 48(4):487-8: Titivated for destruction: the methyl degron; Barbour et al., Biosci Rep. 2013 Jan. 18; 33(1).: Characterization of the bipartite degron that regulates ubiquitin-independent degradation of thymidylate synthase; and Greussing et al., J Vis Exp. 2012 Nov. 10; (69): Monitoring of ubiquitin-proteasome activity in living cells using a Degron (dgn)-destabilized green fluorescent protein (GFP)-based reporter protein; all of which are hereby incorporated in their entirety by reference).

Exemplary degron sequences have been well-characterized and tested in both cells and animals. Thus, fusing Cas9 (e.g., wild type Cas9; variant Cas9; variant Cas9 with reduced nuclease activity, e.g., dCas9; and the like) to a degron sequence produces a “tunable” and “inducible” Cas9 polypeptide. Any of the fusion partners described herein can be used in any desirable combination. As one non-limiting example to illustrate this point, a Cas9 fusion protein (i.e., a chimeric Cas9 polypeptide) can comprise a YFP sequence for detection, a degron sequence for stability, and a transcription activator sequence to increase transcription of the target nucleic acid. A suitable reporter protein for use as a fusion partner for a Cas9 polypeptide (e.g., wild type Cas9, variant Cas9, variant Cas9 with reduced nuclease function, etc.) includes, but is not limited to, the following exemplary proteins (or functional fragments thereof): his3, β-galactosidase, a fluorescent protein (e.g., GFP, RFP, YFP, cherry, tomato, etc., and various derivatives thereof), luciferase, β-glucuronidase, and alkaline phosphatase. Furthermore, the number of fusion partners that can be used in a Cas9 fusion protein is unlimited. In some cases, a Cas9 fusion protein comprises one or more (e.g., two or more, three or more, four or more, or five or more) heterologous sequences.

Suitable fusion partners include, but are not limited to, a polypeptide that provides for methyltransferase activity, demethylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, or demyristoylation activity, any of which can be directed at modifying nucleic acid directly (e.g., methylation of DNA or RNA) or at modifying a nucleic acid-associated polypeptide (e.g., a histone, a DNA binding protein, an RNA binding protein, and the like). Further suitable fusion partners include, but are not limited to, boundary elements (e.g., CTCF), proteins and fragments thereof that provide periphery recruitment (e.g., Lamin A, Lamin B, etc.), and protein docking elements (e.g., FKBP/FRB, Pil1/Aby1, etc.).

Examples of various additional suitable fusion partners (or fragments thereof) for a subject variant Cas9 polypeptide include, but are not limited to, those described in the PCT patent applications WO2010075303, WO2012068627, and WO2013155555, which are hereby incorporated by reference in their entirety.

Suitable fusion partners include, but are not limited to, a polypeptide that provides an activity that indirectly increases transcription by acting directly on the target nucleic acid or on a polypeptide (e.g., a histone, a DNA-binding protein, an RNA-binding protein, an RNA editing protein, etc.) associated with the target nucleic acid. Suitable fusion partners include, but are not limited to, a polypeptide that provides for methyltransferase activity, demethylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, or demyristoylation activity.

Additional suitable fusion partners include, but are not limited to, a polypeptide that directly provides for increased transcription and/or translation of a target nucleic acid (e.g., a transcription activator or a fragment thereof, a protein or fragment thereof that recruits a transcription activator, a small molecule/drug-responsive transcription and/or translation regulator, a translation-regulating protein, etc.).

Non-limiting examples of fusion partners to accomplish increased or decreased transcription include transcription activator and transcription repressor domains (e.g., the Krüppel associated box (KRAB or SKD); the Mad mSIN3 interaction domain (SID); the ERF repressor domain (ERD), etc.). In some such cases, a Cas9 fusion protein is targeted by the guide nucleic acid to a specific location (i.e., sequence) in the target nucleic acid and exerts locus-specific regulation such as blocking RNA polymerase binding to a promoter (which selectively inhibits transcription activator function), and/or modifying the local chromatin status (e.g., when a fusion sequence is used that modifies the target nucleic acid or modifies a polypeptide associated with the target nucleic acid). In some cases, the changes are transient (e.g., transcription repression or activation). In some cases, the changes are inheritable (e.g., when epigenetic modifications are made to the target nucleic acid or to proteins associated with the target nucleic acid, e.g., nucleosomal histones).

In some embodiments, the heterologous sequence can be fused to the C-terminus of the Cas9 polypeptide. In some embodiments, the heterologous sequence can be fused to the N-terminus of the Cas9 polypeptide. In some embodiments, the heterologous sequence can be fused to an internal portion (i.e., a portion other than the N- or C-terminus) of the Cas9 polypeptide.

In some embodiments, a Cas9 polypeptide (e.g., a wild type Cas9, a variant Cas9, a variant Cas9 with reduced nuclease activity, a nickase Cas9, etc.) can be linked to a fusion partner via a peptide spacer (e.g., a linker). Exemplary linker polypeptides include glycine polymers (G)_(n), glycine-serine polymers (including, for example, (GS)_(n), (GSGGS)_(n) (SEQ ID NO: 817), (GGSGGS)_(n) (SEQ ID NO: 818), and (GGGS)_(n) (SEQ ID NO: 819), where n is an integer of at least one), glycine-alanine polymers, and alanine-serine polymers. Exemplary linkers can comprise amino acid sequences including, but not limited to, GGSG (SEQ ID NO: 820), GGSGG (SEQ ID NO: 821), GSGSG (SEQ ID NO: 822), GSGGG (SEQ ID NO: 823), GGGSG (SEQ ID NO: 824), GSSSG (SEQ ID NO: 825), and the like. The ordinarily skilled artisan will recognize that design of a peptide conjugated to any elements described above can include linkers that are all or partially flexible, such that the linker can include a flexible linker as well as one or more portions that confer less flexible structure.
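The repeat-unit linkers described above can be written out programmatically. The following Python sketch is illustrative only: the function names are hypothetical, and the strings `<CAS9>` and `<PARTNER>` are placeholders standing in for actual amino acid sequences, which are not reproduced here.

```python
def repeat_linker(unit: str, n: int) -> str:
    """Build a polymer linker such as (GGGS)_n from its repeat unit."""
    if n < 1:
        raise ValueError("n must be an integer of at least one")
    return unit * n


def fuse(n_terminal: str, linker: str, c_terminal: str) -> str:
    """Concatenate N-terminal part, linker, and C-terminal part, in that order."""
    return n_terminal + linker + c_terminal


# A (GGGS)_3 linker joining two placeholder sequences (hypothetical labels).
linker = repeat_linker("GGGS", 3)
fusion = fuse("<CAS9>", linker, "<PARTNER>")
print(linker)
print(fusion)
```

Swapping the two placeholder arguments to `fuse` would place the fusion partner at the N-terminus instead of the C-terminus.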

Examples of various Cas9 proteins (and guide RNAs) (e.g., including variant Cas9 proteins, chimeric Cas9 proteins, i.e., Cas9 fusion proteins) can be found in the art, for example, see Jinek et al., Science. 2012 Aug. 17; 337(6096):816-21; Chylinski et al., RNA Biol. 2013 May; 10(5):726-37; Ma et al., Biomed Res Int. 2013; 2013:270805; Hou et al., Proc Natl Acad Sci USA. 2013 Sep. 24; 110(39):15644-9; Jinek et al., Elife. 2013; 2:e00471; Pattanayak et al., Nat Biotechnol. 2013 September; 31(9):839-43; Qi et al., Cell. 2013 Feb. 28; 152(5):1173-83; Wang et al., Cell. 2013 May 9; 153(4):910-8; Auer et al., Genome Res. 2013 Oct. 31; Chen et al., Nucleic Acids Res. 2013 Nov. 1; 41(20):e19; Cheng et al., Cell Res. 2013 October; 23(10):1163-71; Cho et al., Genetics. 2013 November; 195(3):1177-80; DiCarlo et al., Nucleic Acids Res. 2013 April; 41(7):4336-43; Dickinson et al., Nat Methods. 2013 October; 10(10):1028-34; Ebina et al., Sci Rep. 2013; 3:2510; Fujii et al., Nucleic Acids Res. 2013 Nov. 1; 41(20):e187; Hu et al., Cell Res. 2013 November; 23(11):1322-5; Jiang et al., Nucleic Acids Res. 2013 Nov. 1; 41(20):e188; Larson et al., Nat Protoc. 2013 November; 8(11):2180-96; Mali et al., Nat Methods. 2013 October; 10(10):957-63; Nakayama et al., Genesis. 2013 December; 51(12):835-43; Ran et al., Nat Protoc. 2013 November; 8(11):2281-308; Ran et al., Cell. 2013 Sep. 12; 154(6):1380-9; Upadhyay et al., G3 (Bethesda). 2013 Dec. 9; 3(12):2233-8; Walsh et al., Proc Natl Acad Sci USA. 2013 Sep. 24; 110(39):15514-5; Xie et al., Mol Plant. 2013 Oct. 9; Yang et al., Cell. 2013 Sep. 12; 154(6):1370-9; Briner et al., Mol Cell. 2014 Oct. 23; 56(2):333-9; and U.S. patents and patent applications: U.S. Pat. Nos. 8,906,616; 8,895,308; 8,889,418; 8,889,356; 8,871,445; 8,865,406; 8,795,965; 8,771,945; 8,697,359; 20140068797; 20140170753; 20140179006; 20140179770; 20140186843; 20140186919; 20140186958; 20140189896; 20140227787; 20140234972; 20140242664; 20140242699; 20140242700; 20140242702; 20140248702; 20140256046; 20140273037; 20140273226; 20140273230; 20140273231; 20140273232; 20140273233; 20140273234; 20140273235; 20140287938; 20140295556; 20140295557; 20140298547; 20140304853; 20140309487; 20140310828; 20140310830; 20140315985; 20140335063; 20140335620; 20140342456; 20140342457; 20140342458; 20140349400; 20140349405; 20140356867; 20140356956; 20140356958; 20140356959; 20140357523; 20140357530; 20140364333; and 20140377868; all of which are hereby incorporated by reference in their entirety.

A Cas9 protein (e.g., a variant Cas9, a dCas9, a nickase Cas9, a chimeric Cas9, etc.) can be introduced into a cell using any convenient method. For example, the protein can be introduced as a nucleic acid (e.g., DNA or RNA) (e.g., an mRNA, an expression vector, a plasmid, a virus, etc.) encoding the Cas9 protein or can be introduced into a cell directly as protein (e.g., in some cases already complexed with a guide RNA to form a ribonucleoprotein complex, sometimes referred to as an RNP). Thus, for example, a Cas9 fusion protein can be introduced into cells as RNA. Methods of introducing RNA into cells are known in the art and may include, for example, direct injection, transfection, or any other method used for the introduction of DNA. A Cas9 fusion protein may instead be provided to cells as a polypeptide. Such a polypeptide may optionally be fused to a polypeptide domain that increases solubility of the product. The domain may be linked to the polypeptide through a defined protease cleavage site, e.g., a TEV sequence, which is cleaved by TEV protease. The linker may also include one or more flexible sequences, e.g., from 1 to 10 glycine residues. In some embodiments, the cleavage of the fusion protein is performed in a buffer that maintains solubility of the product, e.g., in the presence of from 0.5 to 2 M urea, in the presence of polypeptides and/or polynucleotides that increase solubility, and the like. Domains of interest include endosomolytic domains, e.g., influenza HA domain; and other polypeptides that aid in production, e.g., IF2 domain, GST domain, GRPE domain, and the like. The polypeptide may be formulated for improved stability. For example, the peptides may be PEGylated, where the polyethyleneoxy group provides for enhanced lifetime in the blood stream.

Methods of Genome Editing Using an Asymmetric Donor DNA

The present disclosure provides methods of editing DNA of a eukaryotic cell, the methods generally involving introducing into the cell a Cas9 polypeptide (or a nucleic acid comprising a nucleotide sequence encoding a Cas9 polypeptide), a Cas9 guide RNA (or a nucleic acid comprising a nucleotide sequence encoding a Cas9 guide RNA), and an asymmetric double-stranded or single-stranded donor DNA.

The present disclosure provides a method of editing genomic DNA of a eukaryotic cell, where the genomic DNA comprises a target strand and a non-target strand. In some embodiments, the method comprises introducing into the cell: (a) a Cas9 guide RNA, or one or more nucleic acids encoding the Cas9 guide RNA, where the Cas9 guide RNA hybridizes to a target sequence of the target strand of the genomic DNA; (b) an asymmetric double stranded or single stranded donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, where the 3′ homology arm is 20 to 50 nucleotides in length, is shorter than the 5′ homology arm, and comprises at least 10 consecutive nucleotides of said target sequence; and (c) a Cas9 protein or a nucleic acid encoding the Cas9 protein, where (i) the Cas9 protein forms a complex with the Cas9 guide RNA thereby guiding the Cas9 protein to said target sequence, (ii) the 3′ homology arm of the donor DNA molecule hybridizes to the non-target strand of the genomic DNA, and (iii) a nucleotide sequence of the donor DNA molecule is incorporated into the genomic DNA. For example, see FIG. 14A-14B and FIG. 15A-15B.

The present disclosure provides a method of editing, in a eukaryotic cell, a double stranded genomic DNA that comprises: (1) a target strand comprising a target sequence, and (2) a non-target strand comprising: a target-sequence-complement that is complementary to the target sequence of the target strand, and a protospacer adjacent motif (PAM) sequence that is immediately adjacent and 3′ of the target-sequence-complement, where the method comprises introducing into the cell: (a) a Cas9 guide RNA, or a nucleic acid encoding said Cas9 guide RNA, wherein the Cas9 guide RNA comprises a guide sequence that hybridizes to the target sequence of said target strand; (b) a Cas9 protein, or a nucleic acid encoding said Cas9 protein, wherein the Cas9 protein forms a complex with the Cas9 guide RNA and is thereby targeted to the target sequence of said target strand, wherein the Cas9 protein cleaves at least the non-target strand of the genomic DNA, within said target-sequence-complement, into a PAM distal non-target DNA strand and a PAM proximal non-target DNA strand; and (c) an asymmetric double stranded or single stranded donor DNA molecule comprising: (i) a first homology arm 20 to 50 nucleotides in length that comprises a nucleotide sequence that hybridizes to the PAM distal non-target DNA strand, and (ii) a second homology arm that is 70 to 110 nucleotides in length, is 5′ of the first homology arm, and comprises a nucleotide sequence that is complementary to the PAM proximal non-target DNA strand, wherein a nucleotide sequence of the donor DNA molecule is incorporated into the genomic DNA. For example, see FIG. 14A-14B and FIG. 15A-15B.
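The arm-length constraints recited above (a shorter 3′ arm of 20 to 50 nucleotides and a longer 5′ arm of 70 to 110 nucleotides) can be checked with a few lines of code. The following Python sketch is illustrative only: the class and function names are hypothetical, the example sequences are arbitrary homopolymers chosen solely for their lengths, and the check covers arm lengths alone, not homology to any genomic strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsymmetricDonor:
    """Hypothetical container for a donor written 5'->3' as long arm, insert, short arm."""
    five_prime_arm: str   # longer homology arm
    insert: str           # sequence to be incorporated (may be empty)
    three_prime_arm: str  # shorter homology arm

    def sequence(self) -> str:
        return self.five_prime_arm + self.insert + self.three_prime_arm


def check_arm_lengths(donor, short_range=(20, 50), long_range=(70, 110)):
    """Return a list of violated length constraints; an empty list means none."""
    problems = []
    short_len = len(donor.three_prime_arm)
    long_len = len(donor.five_prime_arm)
    if not short_range[0] <= short_len <= short_range[1]:
        problems.append(f"3' arm is {short_len} nt, outside {short_range}")
    if not long_range[0] <= long_len <= long_range[1]:
        problems.append(f"5' arm is {long_len} nt, outside {long_range}")
    if short_len >= long_len:
        problems.append("3' arm is not shorter than the 5' arm")
    return problems


# Arbitrary placeholder sequences: a 90 nt 5' arm, a 6 nt insert, a 30 nt 3' arm.
donor = AsymmetricDonor("A" * 90, "GAATTC", "C" * 30)
print(check_arm_lengths(donor))
```

A symmetric design, for example with two 60-nucleotide arms, would be flagged on all three checks under these default ranges.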

In carrying out a method of the present disclosure for editing a target DNA in a eukaryotic cell, where the donor DNA is an asymmetric donor DNA of the present disclosure, the Cas9 polypeptide can be a ‘dead’ Cas9 (dCas9); a Cas9 nickase, where the nickase cleaves the non-target strand (e.g., a nickase with a functional RuvC domain); or a Cas9 nickase, where the nickase cleaves the target strand (e.g., a nickase with a functional HNH domain). Thus, in some cases, the Cas9 is a ‘dead’ Cas9 (dCas9), and yet genome editing still proceeds despite the lack of cleavage (either a single strand or double strand break in the target DNA). In some cases, the Cas9 is a nickase with a functional RuvC domain (cleaves the non-target strand). In some cases, the Cas9 is a nickase with a functional HNH domain (cleaves the target strand). In some cases, the Cas9 includes a functional RuvC domain and a functional HNH domain, and cleaves both the target and non-target strands.

A method of the present disclosure for genome editing, using an asymmetric donor DNA, provides for increased homology-directed repair (HDR) compared to a method that does not involve use of an asymmetric donor DNA. For example, in some cases, a method of the present disclosure for genome editing, using an asymmetric donor DNA, provides for an at least 10%, at least 25%, at least 50%, at least 100%, at least 2.5-fold, at least 5-fold, at least 10-fold, at least 20-fold, at least 30-fold, at least 40-fold, or at least 50-fold increase in HDR, compared with the rate of HDR when the method is carried out using a symmetrical donor DNA template.

In some cases, a method of the present disclosure for genome editing, using an asymmetric donor DNA, provides for an at least 10%, at least 25%, at least 50%, at least 100%, at least 2.5-fold, at least 5-fold, at least 10-fold, at least 20-fold, at least 30-fold, at least 40-fold, or at least 50-fold increase in the number of target genomic DNA molecules that undergo HDR, compared with the number of target genomic DNA molecules that undergo HDR when the method is carried out using a symmetrical donor DNA template.

In some cases, a method of the present disclosure for genome editing, using an asymmetric donor DNA, provides for an at least 10%, at least 25%, at least 50%, at least 100%, at least 2.5-fold, at least 5-fold, at least 10-fold, at least 20-fold, at least 30-fold, at least 40-fold, or at least 50-fold increase in the number of cells of a targeted cell population that undergo HDR, compared with the number of cells of a targeted cell population that undergo HDR when the method is carried out using a symmetrical donor DNA template.
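The percent and fold comparisons recited above can be related to one another with simple arithmetic. The following Python sketch is illustrative only: the function name is hypothetical, the input values are made-up numbers rather than data from this disclosure, and the sketch assumes that an “N-fold increase” denotes the ratio of the asymmetric-donor HDR rate to the symmetric-donor HDR rate.

```python
def hdr_increase(asymmetric_rate: float, symmetric_rate: float):
    """Return (fold, percent_increase) of HDR relative to a symmetric donor.

    Assumes "N-fold" means the ratio asymmetric/symmetric, so a 2-fold
    value corresponds to a 100% increase.
    """
    if symmetric_rate <= 0:
        raise ValueError("symmetric_rate must be positive")
    fold = asymmetric_rate / symmetric_rate
    percent_increase = (fold - 1.0) * 100.0
    return fold, percent_increase


# Made-up example rates (fraction of target molecules or cells showing HDR).
fold, percent = hdr_increase(0.50, 0.25)
print(f"{fold:.1f}-fold, i.e., a {percent:.0f}% increase")
```

The same function applies whether the rates are measured per target genomic DNA molecule or per cell of a targeted population, provided both rates are measured the same way.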

In some cases, a method of the present disclosure for genome editing, using an asymmetric donor DNA, provides for an increased ratio of HDR to NHEJ, compared to the ratio of HDR to NHEJ when the method is carried out using a symmetrical donor DNA template. For example, in some cases, a method of the present disclosure for genome editing, using an asymmetric donor DNA, provides for a ratio of HDR to NHEJ of at least 1:5, at least 1:4, at least 1:3, at least 1:2, at least 2:1, at least 3:1, at least 4:1, at least 5:1, at least 10:1, or more than 10:1.
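A ratio of HDR to NHEJ can be compared against thresholds such as those recited above using exact rational arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the event counts are made-up numbers rather than data from this disclosure.

```python
from fractions import Fraction


def hdr_to_nhej(hdr_events: int, nhej_events: int) -> Fraction:
    """Return the HDR:NHEJ ratio as an exact fraction."""
    if nhej_events <= 0:
        raise ValueError("nhej_events must be positive")
    return Fraction(hdr_events, nhej_events)


def meets_ratio(hdr_events: int, nhej_events: int,
                hdr_part: int, nhej_part: int) -> bool:
    """True if the observed ratio is at least hdr_part:nhej_part."""
    return hdr_to_nhej(hdr_events, nhej_events) >= Fraction(hdr_part, nhej_part)


# Made-up counts: 300 HDR events and 100 NHEJ events give a 3:1 ratio.
print(hdr_to_nhej(300, 100))
print(meets_ratio(300, 100, 2, 1))  # compared against "at least 2:1"
print(meets_ratio(100, 400, 1, 3))  # compared against "at least 1:3"
```

Using `Fraction` avoids floating-point rounding when an observed ratio sits exactly on a threshold.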

Suitable asymmetric donor DNA molecules, Cas9 polypeptides, and guide RNAs are as described above.

The present disclosure provides methods of editing DNA of a eukaryotic cell, the methods generally involving introducing into the cell a Cas9 polypeptide (or a nucleic acid comprising a nucleotide sequence encoding a Cas9 polypeptide), a Cas9 guide RNA (or a nucleic acid comprising a nucleotide sequence encoding a Cas9 guide RNA), and an asymmetric double-stranded or single-stranded donor DNA. The eukaryotic cell can be referred to as a “target eukaryotic cell.” Suitable target eukaryotic cells include in vitro eukaryotic cells, ex vivo eukaryotic cells, and in vivo eukaryotic cells. In some cases, a target eukaryotic cell is an in vitro eukaryotic cell. In some cases, a target eukaryotic cell is an in vivo eukaryotic cell. In some cases, a target eukaryotic cell is an ex vivo eukaryotic cell.

Suitable eukaryotic cells include a cell of a single-cell eukaryotic organism; a plant cell; an algal cell, e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens C. Agardh, and the like; a fungal cell (e.g., a yeast cell); an animal cell; a cell of an invertebrate animal (e.g., fruit fly, cnidarian, echinoderm, nematode, etc.); a cell of a vertebrate animal (e.g., fish, amphibian, reptile, bird, mammal); a cell of a mammal (e.g., a cell from a rodent such as a mouse or a rat, a cell from a non-human primate, a cell from a human, etc.); and the like.

A suitable eukaryotic cell can be a stem cell (e.g., an embryonic stem (ES) cell, an induced pluripotent stem (iPS) cell); a germ cell; a somatic cell, e.g., a fibroblast, a hematopoietic cell, a neuron, a muscle cell, a bone cell, a hepatocyte, a pancreatic cell; an in vitro or in vivo embryonic cell of an embryo at any stage, e.g., a 1-cell, 2-cell, 4-cell, 8-cell, etc. stage zebrafish embryo; etc. Cells may be from established cell lines or they may be primary cells, where “primary cells”, “primary cell lines”, and “primary cultures” are used interchangeably herein to refer to cells and cell cultures that have been derived from a subject and allowed to grow in vitro for a limited number of passages, i.e., splittings, of the culture. For example, primary cultures include cultures that may have been passaged 0 times, 1 time, 2 times, 4 times, 5 times, 10 times, or 15 times, but not enough times to go through the crisis stage. Primary cell lines can be maintained for fewer than 10 passages in vitro. Suitable eukaryotic cells include unicellular organisms, or cells grown in culture.

If the cells are primary cells, they may be harvested from an organism (e.g., an individual) by any convenient method. For example, leukocytes may be conveniently harvested by apheresis, leukocytapheresis, density gradient separation, etc., while cells from tissues such as skin, muscle, bone marrow, spleen, liver, pancreas, lung, intestine, stomach, etc. are most conveniently harvested by biopsy. An appropriate solution may be used for dispersion or suspension of the harvested cells. Such solution will generally be a balanced salt solution, e.g., normal saline, phosphate-buffered saline (PBS), Hank's balanced salt solution, etc., conveniently supplemented with fetal calf serum or other naturally occurring factors, in conjunction with an acceptable buffer at low concentration, e.g., from 5-25 mM. Convenient buffers include HEPES, phosphate buffers, lactate buffers, etc. The cells may be used immediately, or they may be stored, frozen, for long periods of time, being thawed and capable of being reused. In such cases, the cells can be frozen in 10% dimethyl sulfoxide (DMSO), 50% serum, 40% buffered medium, or some other such solution as is commonly used in the art to preserve cells at such freezing temperatures, and thawed in a manner as commonly known in the art for thawing frozen cultured cells.
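The component volumes for the exemplary freezing solution above (10% DMSO, 50% serum, 40% buffered medium) follow directly from the desired total volume. The following Python sketch is illustrative only: the function name is hypothetical, the percentages are assumed to be by volume, and the 10 mL total is an arbitrary example.

```python
# Composition from the text, assumed here to be percent by volume.
FREEZING_MIX_PERCENT = {"DMSO": 10, "serum": 50, "buffered medium": 40}


def freezing_mix_volumes(total_ml: float) -> dict:
    """Return the volume (mL) of each component for a given total volume."""
    if sum(FREEZING_MIX_PERCENT.values()) != 100:
        raise ValueError("component percentages must sum to 100")
    return {name: total_ml * pct / 100
            for name, pct in FREEZING_MIX_PERCENT.items()}


# Arbitrary example: prepare 10 mL of freezing solution.
for component, volume in freezing_mix_volumes(10).items():
    print(f"{component}: {volume} mL")
```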

In some embodiments, a target eukaryotic cell is in vitro. In some embodiments, a target eukaryotic cell is in vivo. In some embodiments, a target eukaryotic cell is a plant cell or is derived from a plant cell. In some embodiments, a target eukaryotic cell is an animal cell or is derived from an animal cell. In some embodiments, a target eukaryotic cell is an invertebrate cell or is derived from an invertebrate cell. In some embodiments, a target eukaryotic cell is a vertebrate cell or is derived from a vertebrate cell. In some embodiments, a target eukaryotic cell is a mammalian cell or is derived from a mammalian cell. In some embodiments, a target eukaryotic cell is a rodent cell or is derived from a rodent cell. In some embodiments, a target eukaryotic cell is a human cell or is derived from a human cell.

A suitable target eukaryotic cell includes a cell that harbors a genetic defect in its genome, which genetic defect is to be corrected using a method of the present disclosure.

An asymmetric donor DNA system of the present disclosure can be introduced into a host cell by any of a variety of well-known methods. As discussed above, a method of the present disclosure for editing the genome of a eukaryotic cell comprises introducing into the cell: a) a Cas9 polypeptide (or a nucleic acid comprising a nucleotide sequence encoding a Cas9 polypeptide); b) a Cas9 guide RNA (or a nucleic acid comprising a nucleotide sequence encoding a Cas9 guide RNA); and c) an asymmetric double-stranded or single-stranded donor DNA. Thus, in some cases, the method comprises introducing into a target eukaryotic cell: a) a Cas9 polypeptide; b) a Cas9 guide RNA; and c) an asymmetric donor DNA. In other cases, the method comprises introducing into a target eukaryotic cell: a) a Cas9 polypeptide; b) a nucleic acid comprising a nucleotide sequence encoding a Cas9 guide RNA; and c) an asymmetric double-stranded or single-stranded donor DNA. In other instances, the method comprises introducing into a target eukaryotic cell: a) a nucleic acid encoding a Cas9 polypeptide; b) a Cas9 guide RNA; and c) an asymmetric double-stranded or single-stranded donor DNA. In other cases, the method comprises introducing into a target eukaryotic cell: a) a nucleic acid comprising a nucleotide sequence encoding a Cas9 polypeptide; b) a nucleic acid comprising a nucleotide sequence encoding a Cas9 guide RNA; and c) an asymmetric double-stranded or single-stranded donor DNA. In any of these embodiments, the Cas9 polypeptide can be an enzymatically active Cas9 polypeptide. In any of these embodiments, the Cas9 polypeptide can be an enzymatically inactive Cas9 polypeptide (a “dead Cas9” polypeptide). In any of these embodiments, the Cas9 polypeptide can be a “nickase” Cas9 polypeptide.
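The four delivery formats enumerated above arise from two independent choices: whether the Cas9 polypeptide is supplied as protein or as an encoding nucleic acid, and whether the guide RNA is supplied as RNA or as an encoding nucleic acid. The following Python sketch is illustrative only; the labels and the function name are hypothetical shorthand for the formats described in the text.

```python
from itertools import product

# Shorthand labels (hypothetical) for the two forms of each component.
CAS9_FORMS = ("Cas9 polypeptide", "nucleic acid encoding Cas9 polypeptide")
GUIDE_FORMS = ("Cas9 guide RNA", "nucleic acid encoding Cas9 guide RNA")
DONOR = "asymmetric donor DNA"


def delivery_combinations():
    """Enumerate every (Cas9 form, guide form, donor) combination."""
    return [(cas9, guide, DONOR) for cas9, guide in product(CAS9_FORMS, GUIDE_FORMS)]


for number, combination in enumerate(delivery_combinations(), start=1):
    print(number, " + ".join(combination))
```

The enumeration lists the combinations in the same order as the paragraph above; the donor DNA is present in every combination.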

Methods of introducing a nucleic acid into a host cell are known in the art, and any known method can be used to introduce a nucleic acid (e.g., an expression construct) into a target cell. Suitable methods include, e.g., viral infection, transfection, conjugation, protoplast fusion, lipofection, electroporation, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran mediated transfection, liposome-mediated transfection, particle gun technology, direct microinjection, nanoparticle-mediated nucleic acid delivery (see, e.g., Panyam et al., Adv Drug Deliv Rev. 2012 Sep. 13. pii: S0169-409X(12)00283-9. doi: 10.1016/j.addr.2012.09.023), and the like.

In some cases, a Cas9 polypeptide is provided as a nucleic acid (e.g., an mRNA, a DNA, a plasmid, an expression vector, a viral vector, etc.) that encodes the Cas9 fusion polypeptide. In some cases, the Cas9 polypeptide is provided directly as a protein. A Cas9 polypeptide can be introduced into a cell (provided to the cell) by any convenient method; such methods are known to those of ordinary skill in the art. As an illustrative example, a Cas9 polypeptide can be injected directly into a cell (e.g., with or without nucleic acid encoding a Cas9 guide RNA and with or without a donor polynucleotide). As another example, a preformed complex of a Cas9 polypeptide and a Cas9 guide RNA (an RNP) can be introduced into a cell (e.g., via nucleofection; via a protein transduction domain (PTD) conjugated to one or more components, e.g., conjugated to the Cas9 protein, conjugated to a guide RNA, conjugated to a Cas9 polypeptide and a guide RNA; etc.).

Where a Cas9 polypeptide and/or a Cas9 guide RNA are introduced into a target eukaryotic cell as a nucleic acid comprising a nucleotide sequence encoding the Cas9 polypeptide and/or the Cas9 guide RNA, the nucleotide sequence encoding the Cas9 polypeptide and/or the Cas9 guide RNA can be operably linked to a transcriptional control element (e.g., a promoter) that is functional in a eukaryotic cell. In some cases, the promoter is a constitutively active promoter. In some cases, the promoter is an inducible promoter. In some cases, the promoter is a cell type-specific promoter. Suitable promoters can be any known promoter and include constitutively active promoters (e.g., CMV promoter), inducible promoters (e.g., heat shock promoter, tetracycline-regulated promoter, steroid-regulated promoter, metal-regulated promoter, estrogen receptor-regulated promoter, etc.), spatially restricted and/or temporally restricted promoters (e.g., a tissue specific promoter, a cell type specific promoter, etc.), etc.

System for Genome Editing Comprising Dead Cas9

The present disclosure provides a system for editing genomic DNA in a eukaryotic cell, the system comprising: a) a Cas9 polypeptide that exhibits reduced enzymatic activity; b) a Cas9 guide RNA, or a nucleic acid comprising a nucleotide sequence encoding a Cas9 guide RNA; and c) a double-stranded or single-stranded donor DNA template. For simplicity, a Cas9 polypeptide that exhibits reduced enzymatic activity is referred to herein as a “dead Cas9” polypeptide or a “dCas9” polypeptide. In some cases, the donor DNA template is an asymmetric DNA donor template, as described above.

The present disclosure provides a system for editing genomic DNA in a eukaryotic cell, the system comprising: (a) a dead Cas9 (dCas9) protein, or a nucleic acid encoding said dCas9 protein, where the dCas9 protein lacks catalytically active RuvC and HNH domains; (b) a Cas9 guide RNA, or one or more nucleic acids comprising a nucleotide sequence encoding said Cas9 guide RNA, where the Cas9 guide RNA comprises a guide sequence that is complementary to a target sequence of a target genomic DNA of a eukaryotic cell; and (c) a corresponding double stranded or single stranded donor DNA template molecule comprising at least 10 consecutive nucleotides of said target sequence.

A dCas9 polypeptide for use in the method does not cleave either strand of a target DNA. Interaction of the dCas9, the guide RNA, and the donor DNA is illustrated schematically in FIG. 15A-15B.

dCas9

As noted above, in some cases, a variant Cas9 polypeptide has a reduced ability to cleave both the complementary and the non-complementary strands of a double stranded target nucleic acid. As a non-limiting example, in some cases, the variant Cas9 polypeptide harbors mutations in both the RuvC and HNH domains (e.g., mutations at both the D10 and H840 positions, e.g., both the D10A and the H840A mutations, or the corresponding mutations of any of the proteins set forth as SEQ ID NOs: 5-816) such that the polypeptide has a reduced ability to cleave both the complementary and the non-complementary strands of a double stranded target nucleic acid. Such a Cas9 polypeptide can have a reduced ability to cleave a target nucleic acid but retain the ability to bind a target nucleic acid.

Other residues can be mutated to achieve the above effects (i.e., inactivate one or the other nuclease domains). As non-limiting examples, residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and/or A987 of the Cas9 protein set forth in SEQ ID NO: 5 (or the corresponding residues of any of the proteins set forth as SEQ ID NOs: 6-816) can be altered (i.e., substituted) (e.g., see Table 1 for more information regarding the conservation of Cas9 amino acid residues, e.g., those that are highly conserved and are included in the motifs listed in Table 1). Also, mutations other than alanine substitutions are suitable.

In some embodiments, where a variant Cas9 polypeptide has reduced catalytic activity (e.g., when a Cas9 protein has a D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and/or an A987 mutation, e.g., D10A, G12A, G17A, E762A, H840A, N854A, N863A, H982A, H983A, A984A, and/or D986A), the variant Cas9 polypeptide can still bind to target nucleic acid in a site-specific manner (because it is still guided to a target nucleic acid sequence by a guide nucleic acid) as long as it retains the ability to interact with the guide nucleic acid.

Donor DNA

In some cases, the donor DNA template is double stranded. In some cases, the donor DNA template molecule comprises a nucleotide sequence that is heterologous to the target genomic DNA in the eukaryotic cell.

In some cases, the donor DNA molecule comprises one or more synthetic modifications selected from: a base modification, a sugar modification, and a backbone modification. Suitable base modifications, sugar modifications, and backbone modifications are as described above. For example, in some cases, the donor DNA molecule comprises a phosphorothioate linkage.

In some cases, the contacting occurs under conditions that are permissive for nonhomologous end joining or homology-directed repair. In some cases, the donor polynucleotide, a portion of the donor polynucleotide, a copy of the donor polynucleotide, or a portion of a copy of the donor polynucleotide integrates into the target DNA.

In some cases, a Cas9 guide RNA and a dCas9 polypeptide arecoadministered (e.g., contacted with a target nucleic acid, administeredto cells, etc.) with a donor polynucleotide sequence that includes atleast a segment with homology to the target DNA sequence. In such cases,a method of the present disclosure may be used to add, i.e. insert orreplace, nucleic acid material to a target DNA sequence (e.g. to “knockin” a nucleic acid that encodes for a protein, an siRNA, an miRNA,etc.), to add a tag (e.g., 6× His, a fluorescent protein (e.g., a greenfluorescent protein; a yellow fluorescent protein, etc.), hemagglutinin(HA), FLAG, etc.), to add a regulatory sequence to a gene (e.g.promoter, polyadenylation signal, internal ribosome entry sequence(IRES), 2A peptide, start codon, stop codon, splice signal, localizationsignal, etc.), to modify a nucleic acid sequence (e.g., introduce amutation), and the like. As such, a complex comprising a Cas9 guide RNAand a dCas9 polypeptide is useful in any in vitro, ex vivo, or in vivoapplication in which it is desirable to modify a target DNA in asite-specific, i.e. “targeted”, way, for example gene knock-out, geneknock-in, gene editing, gene tagging, etc., as used in, for example,gene therapy, e.g. to treat a disease or as an antiviral,antipathogenic, or anticancer therapeutic, the production of geneticallymodified organisms in agriculture, the large scale production ofproteins by cells for therapeutic, diagnostic, or research purposes, theinduction of iPS cells, biological research, the targeting of genes ofpathogens for deletion or replacement, etc.

In applications in which it is desirable to insert a polynucleotidesequence into a target DNA sequence, a polynucleotide comprising a donorsequence to be inserted is also provided to the cell. By a “donorsequence” or “donor polynucleotide” it is meant a nucleic acid sequenceto be inserted at the site bound by a dCas9 polypeptide (the “targetsite”). The donor polynucleotide will contain sufficient homology to agenomic sequence at the target site, e.g. 70%, 80%, 85%, 90%, 95%, or100% homology with the nucleotide sequences flanking the target site,e.g. within about 50 bases or less of the target site, e.g. within about30 bases, within about 15 bases, within about 10 bases, within about 5bases, or immediately flanking the target site, to supporthomology-directed repair between it and the genomic sequence to which itbears homology. Approximately 25, 50, 100, or 200 nucleotides, or morethan 200 nucleotides, of sequence homology between a donor and a genomicsequence (or any integral value between 10 and 200 nucleotides, or more)will support homology-directed repair. Donor sequences can be of anylength, e.g. 10 nucleotides or more, 50 nucleotides or more, 100nucleotides or more, 250 nucleotides or more, 500 nucleotides or more,1000 nucleotides or more, 5000 nucleotides or more, etc.

The donor sequence is typically not identical to the genomic sequencethat it replaces. Rather, the donor sequence may contain at least one ormore single base changes, insertions, deletions, inversions orrearrangements with respect to the genomic sequence, so long assufficient homology is present to support homology-directed repair. Insome embodiments, the donor sequence comprises a non-homologous sequenceflanked by two regions of homology, such that homology-directed repairbetween the target DNA region and the two flanking sequences results ininsertion of the non-homologous sequence at the target region. Donorsequences may also comprise a vector backbone containing sequences thatare not homologous to the DNA region of interest and that are notintended for insertion into the DNA region of interest. Generally, thehomologous region(s) of a donor sequence will have at least 50% sequenceidentity to a genomic sequence with which recombination is desired. Incertain embodiments, 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9%sequence identity is present. Any value between 1% and 100% sequenceidentity can be present, depending upon the length of the donorpolynucleotide.

The donor sequence may comprise certain sequence differences as comparedto the genomic sequence, e.g. restriction sites, nucleotidepolymorphisms, selectable markers (e.g., drug resistance genes,fluorescent proteins, enzymes etc.), etc., which may be used to assessfor successful insertion of the donor sequence at the target site or insome cases may be used for other purposes (e.g., to signify expressionat the targeted genomic locus). In some cases, if located in a codingregion, such nucleotide sequence differences will not change the aminoacid sequence, or will make silent amino acid changes (i.e., changeswhich do not affect the structure or function of the protein).Alternatively, these sequences differences may include flankingrecombination sequences such as FLPs, loxP sequences, or the like, thatcan be activated at a later time for removal of the marker sequence.

The donor sequence may be provided to the cell as single-stranded DNA, single-stranded RNA, double-stranded DNA, or double-stranded RNA. It may be introduced into a cell in linear or circular form. If introduced in linear form, the ends of the donor sequence may be protected (e.g., from exonucleolytic degradation) by methods known to those of skill in the art. For example, one or more dideoxynucleotide residues are added to the 3′ terminus of a linear molecule and/or self-complementary oligonucleotides are ligated to one or both ends. See, for example, Chang et al. (1987) Proc. Natl. Acad. Sci. USA 84:4959-4963; Nehls et al. (1996) Science 272:886-889. Additional methods for protecting exogenous polynucleotides from degradation include, but are not limited to, addition of terminal amino group(s) and the use of modified internucleotide linkages such as, for example, phosphorothioates, phosphoramidates, and O-methyl ribose or deoxyribose residues. As an alternative to protecting the termini of a linear donor sequence, additional lengths of sequence may be included outside of the regions of homology that can be degraded without impacting recombination. A donor sequence can be introduced into a cell as part of a vector molecule having additional sequences such as, for example, replication origins, promoters and genes encoding antibiotic resistance. Moreover, donor sequences can be introduced as naked nucleic acid, as nucleic acid complexed with an agent such as a liposome or poloxamer, or can be delivered by viruses (e.g., adenovirus, AAV), as described above for nucleic acids encoding a Cas9 guide RNA and/or a dCas9 polypeptide and/or donor polynucleotide.

Asymmetric Donor DNA

In some cases, the donor DNA is an asymmetric donor DNA, as described above. For example, in some cases, the donor DNA is an asymmetric donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides in length, is shorter than the 5′ homology arm, and comprises the at least 10 consecutive nucleotides of said target sequence. In some cases, the donor DNA template molecule is single stranded. In some cases, the donor DNA template molecule is double stranded.
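By way of illustration only, the three structural conditions recited above for the 3′ homology arm can be checked computationally. The following is a minimal sketch; the function name `is_asymmetric_donor` and its parameters are hypothetical and not part of the disclosure, and the sketch assumes that the arm sequences and the target sequence are written on the same strand in the same 5′ to 3′ orientation.

```python
def is_asymmetric_donor(five_prime_arm: str,
                        three_prime_arm: str,
                        target_sequence: str,
                        min_shared: int = 10) -> bool:
    """Check the three conditions recited for the 3' homology arm.

    Hypothetical helper: assumes all sequences are given on the same
    strand, 5' to 3', using the letters A, C, G, T.
    """
    five = five_prime_arm.upper()
    three = three_prime_arm.upper()
    target = target_sequence.upper()

    # Condition 1: the 3' homology arm is 20 to 50 nucleotides in length.
    if not 20 <= len(three) <= 50:
        return False

    # Condition 2: the 3' homology arm is shorter than the 5' homology arm.
    if len(three) >= len(five):
        return False

    # Condition 3: the 3' arm contains at least `min_shared` consecutive
    # nucleotides of the target sequence (any window of that length).
    windows = range(len(target) - min_shared + 1)
    return any(target[i:i + min_shared] in three for i in windows)
```

A donor whose 3′ arm is 24 nucleotides long, whose 5′ arm is 60 nucleotides long, and whose 3′ arm shares a 15-nucleotide stretch with the target sequence would pass this check; a donor with a 15-nucleotide 3′ arm would not.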

Guide RNA

In some cases, a dCas9 system of the present disclosure comprises only one guide RNA, which may be a single-guide RNA or a dual-guide RNA. Suitable guide RNAs are as described above.

In some cases, the guide RNA comprises one or more synthetic modifications selected from: a base modification, a sugar modification, and a backbone modification. Suitable base modifications, sugar modifications, and backbone modifications are as described above. For example, in some cases, the guide RNA comprises a phosphorothioate linkage.

In some cases, a dCas9 system of the present disclosure comprises two ormore Cas9 guide RNAs, or one or more nucleic acids encoding the two ormore Cas9 guide RNAs, wherein the guide sequences of the two or moreCas9 guide RNAs are complementary to target sequences that do notoverlap with one another and are each separated from one another by1-100 nucleotides; e.g., are separated from one another by from 1 nt to5 nt, from 5 nt to 10 nt, from 10 nt to 15 nt, from 15 nt to 20 nt, from20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 40 nt, from 40 nt to50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt,from 80 nt to 90 nt, or from 90 nt to 100 nt. In some cases, the two ormore Cas9 guide RNAs are complementary to target sequences that overlapwith the donor DNA template molecule.

In some cases, a dCas9 system of the present disclosure comprises threeor more Cas9 guide RNAs, or one or more nucleic acids encoding the threeor more Cas9 guide RNAs, wherein the guide sequences of the three ormore Cas9 guide RNAs are complementary to target sequences that do notoverlap with one another and are each separated from one another by1-100 nucleotides; e.g., are separated from one another by from 1 nt to5 nt, from 5 nt to 10 nt, from 10 nt to 15 nt, from 15 nt to 20 nt, from20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 40 nt, from 40 nt to50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt,from 80 nt to 90 nt, or from 90 nt to 100 nt. In some cases, the threeor more Cas9 guide RNAs are complementary to target sequences thatoverlap with the donor DNA template molecule.

In some cases, a dCas9 system of the present disclosure comprises fouror more Cas9 guide RNAs, or one or more nucleic acids encoding the fouror more Cas9 guide RNAs, wherein the guide sequences of the four or moreCas9 guide RNAs are complementary to target sequences that do notoverlap with one another and are each separated from one another by1-100 nucleotides; e.g., are separated from one another by from 1 nt to5 nt, from 5 nt to 10 nt, from 10 nt to 15 nt, from 15 nt to 20 nt, from20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 40 nt, from 40 nt to50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt,from 80 nt to 90 nt, or from 90 nt to 100 nt. In some cases, the four ormore Cas9 guide RNAs are complementary to target sequences that overlapwith the donor DNA template molecule.
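As a non-limiting illustration, the spacing condition recited above for two or more, three or more, or four or more guide RNAs can be expressed as a simple check on target-site coordinates. The sketch below is hypothetical: the function name `spacing_ok` and the half-open (start, end) coordinate convention are assumptions made for illustration, and the separation is evaluated between neighboring target sites along the target DNA, which is one reading of "each separated from one another".

```python
from typing import List, Tuple


def spacing_ok(sites: List[Tuple[int, int]],
               min_gap: int = 1,
               max_gap: int = 100) -> bool:
    """Return True if neighboring target sites do not overlap and are
    separated by min_gap to max_gap nucleotides.

    `sites` holds half-open (start, end) coordinates of each guide RNA
    target sequence on the target DNA (an assumed convention).
    """
    if len(sites) < 2:
        raise ValueError("at least two target sites are required")

    ordered = sorted(sites)
    for (_, end_a), (start_b, _) in zip(ordered, ordered[1:]):
        # Number of nucleotides lying between two neighboring sites;
        # zero means the sites abut and a negative value means overlap.
        gap = start_b - end_a
        if gap < min_gap or gap > max_gap:
            return False
    return True
```

For example, two 20-nucleotide target sites at positions 0-20 and 25-45 are separated by 5 nucleotides and satisfy the condition, whereas sites at 0-20 and 15-35 overlap and do not.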

Cells

A system of the present disclosure that comprises a dCas9 polypeptide, a guide RNA (or one or more nucleic acids comprising nucleotide sequences encoding the guide RNA), and a donor DNA (or a nucleic acid comprising a nucleotide sequence encoding the donor DNA) can in some cases further include a eukaryotic cell comprising said target genomic DNA.

Suitable eukaryotic cells include, but are not limited to, plant cells;algal cells; fungal cells; unicellular eukaryotic organisms (includingpathogenic unicellular eukaryotic organisms); reptile cells; amphibiancells; insect cells; arthropod cells; and mammalian cells (e.g.,ungulate cells (e.g., bovine cells, ovine cells, caprine cells, equinecells, etc.); feline cells; canine cells; non-human primate cells; humancells). Suitable cells include stem cells, including embryonic stemcells and adult stem cells; progenitor cells; hepatic cells;lymphocytes; oligodendrocytes; neurons; and the like.

Methods of Genome Editing Using Dead Cas9

The present disclosure provides a method of editing a target genomic DNA in a eukaryotic cell. The method generally involves introducing into the cell: a) a dead Cas9 polypeptide, or a nucleic acid comprising a nucleotide sequence encoding a dead Cas9 polypeptide; b) a Cas9 guide RNA, or one or more nucleic acids comprising nucleotide sequences encoding a Cas9 guide RNA; and c) a single-stranded or double-stranded DNA donor template comprising at least 10 consecutive nucleotides of a target sequence in the target genomic DNA. Suitable eukaryotic cells are as described above. Suitable guide RNAs are as described above. Suitable donor DNA templates are as described above. In some cases, the method comprises introducing into an in vitro eukaryotic cell: a) a dead Cas9 polypeptide, or a nucleic acid comprising a nucleotide sequence encoding a dead Cas9 polypeptide; b) a Cas9 guide RNA, or one or more nucleic acids comprising nucleotide sequences encoding a Cas9 guide RNA; and c) a single-stranded or double-stranded DNA donor template comprising at least 10 consecutive nucleotides of a target sequence in the target genomic DNA. In some cases, the method comprises introducing into an in vivo eukaryotic cell: a) a dead Cas9 polypeptide, or a nucleic acid comprising a nucleotide sequence encoding a dead Cas9 polypeptide; b) a Cas9 guide RNA, or one or more nucleic acids comprising nucleotide sequences encoding a Cas9 guide RNA; and c) a single-stranded or double-stranded DNA donor template comprising at least 10 consecutive nucleotides of a target sequence in the target genomic DNA.

The present disclosure provides a method of editing a target genomic DNA of a eukaryotic cell, the method comprising introducing into the eukaryotic cell: (a) a dead Cas9 (dCas9) protein, or a nucleic acid encoding said dCas9 protein, wherein the dCas9 protein does not cleave the target genomic DNA; (b) a Cas9 guide RNA, or one or more nucleic acids encoding said Cas9 guide RNA, wherein the Cas9 guide RNA hybridizes to a target sequence of the target genomic DNA; and (c) a corresponding double stranded or single stranded donor DNA template molecule comprising at least 10 consecutive nucleotides of said target sequence, where the dCas9 protein forms a complex with the Cas9 guide RNA thereby guiding the dCas9 protein to said target sequence, and wherein a nucleotide sequence of the donor DNA molecule is incorporated into the genomic DNA.

In some cases, the method comprises introducing two or more Cas9 guide RNAs, or one or more nucleic acids encoding the two or more Cas9 guide RNAs, into a eukaryotic cell comprising a target DNA, where the two or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides. In some cases, the method comprises introducing two or more Cas9 guide RNAs, or one or more nucleic acids encoding the two or more Cas9 guide RNAs, into a eukaryotic cell comprising a target DNA, where the two or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by from 1 nt to 5 nt, from 5 nt to 10 nt, from 10 nt to 15 nt, from 15 nt to 20 nt, from 20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 40 nt, from 40 nt to 50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt, from 80 nt to 90 nt, or from 90 nt to 100 nt. In some cases, the two or more Cas9 guide RNAs hybridize to target sequences that overlap with the donor DNA template molecule.

In some cases, the method comprises introducing three or more Cas9 guide RNAs, or one or more nucleic acids encoding the three or more Cas9 guide RNAs, into a eukaryotic cell comprising a target DNA, where the three or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides. In some cases, the method comprises introducing three or more Cas9 guide RNAs, or one or more nucleic acids encoding the three or more Cas9 guide RNAs, into a eukaryotic cell comprising a target DNA, where the three or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by from 1 nt to 5 nt, from 5 nt to 10 nt, from 10 nt to 15 nt, from 15 nt to 20 nt, from 20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 40 nt, from 40 nt to 50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt, from 80 nt to 90 nt, or from 90 nt to 100 nt. In some cases, the three or more Cas9 guide RNAs hybridize to target sequences that overlap with the donor DNA template molecule.

In some cases, the method comprises introducing four or more Cas9 guide RNAs, or one or more nucleic acids encoding the four or more Cas9 guide RNAs, into a eukaryotic cell comprising a target DNA, where the four or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides. In some cases, the method comprises introducing four or more Cas9 guide RNAs, or one or more nucleic acids encoding the four or more Cas9 guide RNAs, into a eukaryotic cell comprising a target DNA, where the four or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by from 1 nt to 5 nt, from 5 nt to 10 nt, from 10 nt to 15 nt, from 15 nt to 20 nt, from 20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 40 nt, from 40 nt to 50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt, from 80 nt to 90 nt, or from 90 nt to 100 nt. In some cases, the four or more Cas9 guide RNAs hybridize to target sequences that overlap with the donor DNA template molecule.

Suitable eukaryotic cells include, but are not limited to, plant cells;algal cells; fungal cells; unicellular eukaryotic organisms (includingpathogenic unicellular eukaryotic organisms); reptile cells; amphibiancells; insect cells; arthropod cells; and mammalian cells (e.g.,ungulate cells (e.g., bovine cells, ovine cells, caprine cells, equinecells, etc.); feline cells; canine cells; non-human primate cells; humancells). Suitable cells include stem cells, including embryonic stemcells and adult stem cells; progenitor cells; hepatic cells;lymphocytes; oligodendrocytes; neurons; and the like.

EXAMPLES

The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g. amounts, temperature, etc.) but some experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Celsius, and pressure is at or near atmospheric. Standard abbreviations may be used, e.g., bp, base pair(s); kb, kilobase(s); pl, picoliter(s); s or sec, second(s); min, minute(s); h or hr, hour(s); aa, amino acid(s); nt, nucleotide(s); i.m., intramuscular(ly); i.p., intraperitoneal(ly); s.c., subcutaneous(ly); and the like.

Example 1

Materials and Methods

Cas9 and RNA Preparation

Cas9 (pCR1002), dCas9 (pCR1003), Cas9-2NLS (pCR1053), D10ACas9-2NLS (pCR1054), H840ACas9-2NLS (pCR1055), and dCas9-2NLS (pCR1056) were purified by a combination of affinity, ion exchange, and size exclusion chromatographic steps as previously described²², except protein was eluted at 40 uM in 20 mM HEPES KOH pH 7.5, 5% glycerol, 150 mM KCl, 1 mM dithiothreitol (DTT). FIG. 18 contains the expressed sequences for each of these vectors.

Single guide RNAs (sgRNAs) were generated by HiScribe (NEB E2050S) T7 in vitro transcription using PCR-generated DNA as a template²² (“dx.doi” followed by “.org/10.17504/protocols” followed by “.io.dm749m”). Sequences for the sgRNA templates can be found in FIG. 17.

BioLayer Interferometry

The Octet RED384 BioLayer Interferometry machine and Streptavidin (SA) Biosensors are available from ForteBio (Menlo Park, Calif.). All steps were performed in Reaction Buffer (20 mM Tris pH 7.0, 100 mM KCl, 5 mM MgCl₂, 1 mM DTT, 0.01% Tween, 50 μg/mL Heparin) at 37° C. Biosensors were incubated with 300 nM double stranded DNA (Single ribonucleoprotein (RNP) Substrate) biotinylated on the 5′ terminus of the non-target strand for 180 seconds and free DNA was washed away. 200 nM Cas9 (expressed from pCR1002) or dCas9 (expressed from pCR1003) were mixed with a 20% molar excess of sgRNA and incubated for 10 minutes to form RNP. Biosensor tips conjugated to substrate DNA were incubated with RNP for 300 seconds to load RNP. Biosensor-double stranded DNA (dsDNA)-RNP complexes were allowed to dissociate in Reaction Buffer for 3600 seconds. Response curves for each biosensor were normalized against biosensors conjugated to DNA but without RNP (buffer-only control). Normalized response curves were processed using Octet software version 7 to obtain reported kinetic values.

EMSA Assays

Nuclease and sgRNA were incubated in Reaction Buffer for 30 minutes to form RNP. Substrate DNA was added and RNP loading was allowed to take place for a defined interval (16 hours, equilibrium experiments; ten minutes, strand displacement/annealing experiments). Assembled RNP-dsDNA complexes were incubated with challenge DNA for the reported amount of time. Standard reaction conditions were: Substrate DNA, 100 nM; Cas9, 500 nM; sgRNA, 500 nM; Challenge DNA, 1500 nM. All reactions were performed at 37° C. Sequences for the substrate and challenge oligonucleotides (IDT) can be found in FIG. 17.

Cell Lines

HEK293 cells were obtained from ATCC and verified mycoplasma-free (Lonza Mycoalert LT-07). All cells were maintained in DMEM supplemented with 10% FBS and 100 μg/mL Penicillin-Streptomycin.

Reporter Strain Construction

HEK293 cells were transduced with lentivirus expressing a BFP reporter construct under the EF1alpha promoter (Addgene Deposit Pending). Dilution cloning was used to isolate clonal populations with robust expression of blue fluorescent protein (BFP) (as measured by flow cytometry). Cell populations were periodically sorted on a BD FACSJAZZ to maintain BFP expression levels.

Nucleofection Editing Experiments

100 pmoles of Cas9-2NLS (or variants) was diluted to a final volume of 5 uL with Cas9 buffer (20 mM HEPES (pH 7.5), 150 mM KCl, 1 mM MgCl₂, 10% glycerol and 1 mM tris(2-carboxyethyl)phosphine (TCEP)) and mixed slowly into 5 uL of Cas9 buffer containing 120 pmoles of L2 sgRNA. The resulting mixture was incubated for ten minutes at RT to allow RNP formation. 2×10⁵ HEK293 cells were harvested, washed once in PBS, and resuspended in 20 uL of SF nucleofection buffer (Lonza, Basel, Switzerland). 10 uL of RNP mixture, 100 pmoles of donor DNA, and cell suspension were combined in a Lonza 4d strip nucleocuvette. Reaction mixtures were electroporated using setting DS150, incubated in the nucleocuvette at RT for ten minutes, and transferred to culture dishes containing pre-warmed media (“dx.doi” followed by “.org/10.17504/protocols” followed by “.io.dm649d”). Editing outcomes were measured four and seven days post-nucleofection by flow cytometry. Seven-day results are presented in the figures.
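The quantities recited in this protocol can be related by simple unit arithmetic, since a 1 uM solution contains 1 pmole per uL. The sketch below is illustrative only; the helper names are hypothetical, and it assumes that the Cas9-2NLS stock is at the 40 uM elution concentration given in the Cas9 and RNA Preparation section, which the protocol itself does not state.

```python
def stock_volume_ul(amount_pmol: float, stock_conc_um: float) -> float:
    """Volume of stock (uL) that contains `amount_pmol` picomoles.

    Uses the identity 1 uM == 1 pmol/uL.
    """
    return amount_pmol / stock_conc_um


def molar_ratio(sgrna_pmol: float, cas9_pmol: float) -> float:
    """Molar ratio of sgRNA to Cas9 in the assembled RNP mixture."""
    return sgrna_pmol / cas9_pmol


def concentration_um(amount_pmol: float, volume_ul: float) -> float:
    """Concentration (uM) of `amount_pmol` dissolved in `volume_ul`."""
    return amount_pmol / volume_ul


# Values taken from the protocol: 100 pmoles Cas9, 120 pmoles sgRNA,
# 5 uL + 5 uL = 10 uL of RNP mixture. The 40 uM stock is an assumption.
cas9_stock_ul = stock_volume_ul(100, 40)   # uL of stock to dilute to 5 uL
ratio = molar_ratio(120, 100)              # sgRNA in 20% molar excess
cas9_in_rnp_um = concentration_um(100, 10) # Cas9 concentration in the RNP mix
```

Under these assumptions, 2.5 uL of stock supplies the 100 pmoles of Cas9, the sgRNA is present at a 1.2:1 molar ratio to Cas9, and Cas9 is at 10 uM in the 10 uL RNP mixture.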

Terminal Transferase-qPCR Assay

667 pmoles of Cas9-2NLS (or variants) was diluted to a final volume of 20 uL with Cas9 buffer and mixed slowly into 20 uL of Cas9 buffer containing 800 pmoles of sgRNA. The resulting mixture was incubated for ten minutes at RT to allow RNP formation. 4×10⁶ cells were harvested, washed once in PBS, and resuspended in 100 uL of SF nucleofection buffer (Lonza V4XC-2032). 50 uL of RNP mixture, 100 pmoles of donor DNA, and cell suspension were combined in a Lonza 4d strip nucleocuvette. Reaction mixtures were electroporated using setting DS150, incubated in the nucleocuvette at RT for ten minutes, and transferred to culture dishes containing pre-warmed media. Cells were allowed to recover for three hours, then fixed in 2% formaldehyde for 15 minutes at 4° C. and permeabilized overnight in 70% ethanol at −20° C. 1×10⁶ cells were rehydrated in terminal transferase (TdT) buffer (Roche 03333574001) for ten minutes at 37° C. and incubated with 800 U of TdT and 2 nmoles of Biotin-16-dUTP (Roche 11093070910) for 30 minutes at 37° C. Labeled cells were resuspended in Lysis Buffer (50 mM Tris, pH 7.4, 500 mM NaCl, 0.4% SDS, 5 mM EDTA, 1 mM DTT) and disrupted by sonication in a Covaris S220. Crosslinks were reversed by incubation for 4 hours at 65° C. and cell slurry was cleared by centrifugation at max speed for ten minutes.

Biotinylated DNA in the supernatant was bound to streptavidin MyOne C1 Dynabeads (Life Technologies 65002) and washed 3× in Wash Buffer (5 mM Tris, 0.5 mM EDTA, 1 M NaCl). Non-biotinylated strands were dissociated by incubation in 20 mM NaOH for ten minutes. The washed beads were resuspended to a final concentration of 10 ug/uL in dH₂O. Two microliters of bead slurry was used as template for qPCR.

A total of three independent cultures were analyzed for each sgRNA on an Eppendorf Nexus X1 qPCR machine using primers listed in FIG. 17. Each reaction was performed using DyNAmo HS SYBR Green qPCR kit (Fisher Scientific F-410L) in a total volume of 20 μL with primers at a final concentration of 300 nM. Annealing was performed at 62° C. Fold enrichment of the assayed DNA segments over the un-labeled ACT1B locus was calculated using the 2^(−ΔΔCt) method essentially as described²³.
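For reference, the 2^(−ΔΔCt) calculation can be sketched as follows. This is a generic statement of the method and not a record of the analysis actually performed: the function name is hypothetical, the choice of a "control" condition (for example, a sample without the sgRNA of interest) is an assumption, and the Ct values in the usage lines are invented solely to show the arithmetic.

```python
def fold_enrichment(ct_target_sample: float,
                    ct_reference_sample: float,
                    ct_target_control: float,
                    ct_reference_control: float) -> float:
    """Fold enrichment by the 2^(-ddCt) method.

    dCt  = Ct(assayed segment) - Ct(reference locus), within one condition
    ddCt = dCt(sample) - dCt(control)
    The reference locus here plays the role of the un-labeled locus.
    """
    d_ct_sample = ct_target_sample - ct_reference_sample
    d_ct_control = ct_target_control - ct_reference_control
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Invented Ct values, for illustration of the arithmetic only:
# dCt(sample) = 24 - 26 = -2, dCt(control) = 27 - 26 = 1, ddCt = -3.
example = fold_enrichment(24.0, 26.0, 27.0, 26.0)
```

With these illustrative inputs ΔΔCt is −3, corresponding to an 8-fold enrichment; identical ΔCt values in sample and control give a fold enrichment of 1.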

Results

Bio-Layer Interferometry (BLI) was used to determine the in vitro kinetics of RNP interaction with substrate DNA under native conditions¹³. A biotinylated 55 base pair substrate DNA (λ1 described in¹²) was immobilized to a streptavidin-coated BLI probe, and the binding and dissociation of Cas9 or catalytically inactive dCas9 was measured (FIG. 5A). Despite the ability of Cas9 to cleave the target DNA (FIG. 5E), it was observed that both proteins have identical affinities for DNA (K_(D) 1.2±0.1 nM, FIG. 1A and FIG. 5B), as well as identical off rates of ~5.0±0.3×10⁻⁵ s⁻¹, which equates to a lifetime of 5.5 hours. The tight interaction of both RNP variants with substrate DNA was sgRNA-dependent and required a protospacer adjacent motif (PAM) (FIG. 5C). To resolve whether Cas9 preferentially dissociates from one end of cut duplex DNA under physiological conditions, substrate DNA on each side of the nuclease cut site was labeled with a distinct fluorophore and dissociation of cut fragments was monitored using an electrophoretic mobility shift assay (EMSA). By chasing with unlabeled duplex competitor DNA, the off rate of the PAM-distal side of the cut (1×10⁻⁵ s⁻¹) and the off rate of the PAM-proximal side of the cut (6×10⁻⁶ s⁻¹) were determined (FIG. 1B and FIG. 5D); these values are very similar to the values measured by BLI. Cas9's lifetime on DNA is therefore approximately 5-fold longer than the lower bound previously established by fluorescent means¹², and under native conditions Cas9 symmetrically dissociates from the target duplex. The in vitro Cas9-DNA lifetime is similar to the time required to repair Cas9 lesions in mammalian cells (6-24 hours)¹⁴, but longer than the time required to repair double strand breaks caused by ionizing radiation (t_(1/2)=60 minutes)¹⁵. Thus, Cas9 may bind so stably to substrate DNA that it conceals the underlying double strand break and limits its recognition by genome surveillance factors.
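The relationship between the reported off rate and the stated lifetime can be made explicit. The sketch below assumes simple first-order dissociation, so that the mean lifetime is 1/k_off and the half-life is ln(2)/k_off, and assumes simple 1:1 binding, so that K_D = k_off/k_on; the function names are hypothetical, and the implied on rate is a derived illustration rather than a value reported in the text.

```python
import math


def lifetime_hours(k_off_per_s: float) -> float:
    """Mean lifetime (hours) of a complex with first-order off rate k_off."""
    return 1.0 / k_off_per_s / 3600.0


def half_life_hours(k_off_per_s: float) -> float:
    """Half-life (hours) of the complex, ln(2) / k_off."""
    return math.log(2) / k_off_per_s / 3600.0


def implied_k_on(k_off_per_s: float, k_d_molar: float) -> float:
    """On rate (M^-1 s^-1) implied by K_D = k_off / k_on (1:1 binding)."""
    return k_off_per_s / k_d_molar


# BLI values from the text: k_off ~ 5.0e-5 s^-1 and K_D = 1.2 nM.
bli_lifetime = lifetime_hours(5.0e-5)      # roughly 5.6 h (reported as 5.5 hours)
bli_half_life = half_life_hours(5.0e-5)    # roughly 3.9 h
bli_k_on = implied_k_on(5.0e-5, 1.2e-9)    # roughly 4.2e4 M^-1 s^-1
```

An off rate of 5.0×10⁻⁵ s⁻¹ corresponds to a mean lifetime of 20,000 seconds, consistent with the lifetime of about 5.5 hours stated above.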

FIG. 1: Cas9 interacts stably with substrate DNA. (FIG. 1A) Bio-layer interferometry measurements of association (left of dotted line) and dissociation (right of dotted line) of Cas9 (black trace) or dCas9 (brown trace) with λ1 dsDNA. Mean±standard deviation (SD) kinetic values calculated from n=2 experiments are inset. See FIG. 5 for all experiments and data fitting. (FIG. 1B) Electrophoretic mobility shift assay (EMSA) measuring dissociation of Cas9 from substrate dsDNA. Cas9 RNP was equilibrated with S1 dsDNA for 16 hours, after which unlabeled challenge dsDNA was added for the indicated time and reaction products were visualized on a native polyacrylamide gel. Black circle, labeled substrate DNA (no Cas9); open triangle, Cas9-DNA complex; region of interest, dashed box. Data shown are representative of n=2 experiments. Subsequent figures highlight the region of interest corresponding to the dashed lines.

FIG. 5: (FIG. 5A) Schematic of BLI assay used to measure dissociation. 5′ monobiotinylated substrate DNA (identical to λ1, FIG. 2A-2E) is associated with streptavidin-coated sensor tips (black oval) and baseline signal is established (left panel). Association phase (right panel) loads Cas9 onto substrate dsDNA and measures response. Dissociation phase (not shown) transfers the tip into buffer and monitors dissociation of Cas9. (FIG. 5B) Fit between BLI data (thick trace) and calculated kinetic values (maroon trace) for Cas9 (black) and dCas9 (brown). Replicate data are shown. (FIG. 5C) Cas9 interacts specifically with substrate dsDNA. BLI traces show no interaction of apoCas9 (no sgRNA) with substrate dsDNA (maroon trace) or Cas9 with substrate dsDNA lacking a PAM (blue trace). (n=2). (FIG. 5D) Gel densitometry of FIG. 1B. Mean±SD normalized intensity of + strand (blue) and − strand (red) shifted products was plotted as a function of time. The indicated regression lines were used to calculate k_(off). (FIG. 5E) Cas9, Cas9D10A, and Cas9H840A cleave DNA while dCas9 does not. Cas9 nucleases were incubated with or without sgRNA for 30 minutes and associated with λ1 substrate DNA (FIG. 2A) for ten minutes. Untagged (pCR1002 and pCR1003, FIG. 18) and NLS-tagged (pCR1053-pCR1056, FIG. 18) Cas9 variants were tested and found to have equivalent activity. Reaction products were resolved on a 10% TBE-Urea gel. Open arrow, uncut substrate DNA; *, excess Cy5 labeled ssDNA; ‡, excess Cy3 labeled ssDNA. Data presented are representative of n=2 experiments.
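One way an off rate can be extracted from densitometry time courses such as those in FIG. 5D is sketched below. The regression model is not specified here, so this example assumes single-exponential (first-order) dissociation and fits a straight line to the natural logarithm of the normalized shifted-band intensity versus time; the function name and the synthetic data are hypothetical and are not measurements from the figure.

```python
import math


def fit_k_off(times_s, fractions_bound):
    """Estimate k_off (s^-1) as the negative least-squares slope of
    ln(fraction bound) versus time, assuming first-order dissociation."""
    log_fractions = [math.log(f) for f in fractions_bound]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(log_fractions) / n
    # Ordinary least-squares slope: covariance(t, y) / variance(t).
    s_ty = sum((t - mean_t) * (y - mean_y)
               for t, y in zip(times_s, log_fractions))
    s_tt = sum((t - mean_t) ** 2 for t in times_s)
    return -s_ty / s_tt


# Synthetic, noise-free time course generated with k_off = 1e-5 s^-1.
times = [0.0, 3600.0, 7200.0, 14400.0, 28800.0]
fractions = [math.exp(-1.0e-5 * t) for t in times]
print(fit_k_off(times, fractions))
```

On real data the normalized intensities would carry noise, and replicate gels would be fitted separately or pooled before regression.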

To identify intermediate species that form during Cas9-mediated DNA cleavage, the Cas9-DNA complex was challenged with unlabeled DNA of various compositions (FIG. 2A and FIG. 6). Surprisingly, incubation of Cas9-DNA with an excess of single stranded DNA (ssDNA) identical to the target strand caused the fluorophore attached to the 5′ end of the non-target strand to be lost from the complex, while a fluorophore attached to the 5′ end of the target strand was not affected by either double or single stranded challenge DNA (FIG. 2B, lanes 2-4). By systematically labeling the 5′ or 3′ termini of either strand of the substrate DNA, it was found that target strand challenge DNA removes the PAM-distal non-target strand from the Cas9-DNA complex without affecting the other three strands (FIG. 7A). This strand-removal activity required sequence complementarity between Cas9-bound DNA and challenge DNA outside the protospacer sequence (FIG. 2B, lanes 8-10), but the PAM sequence of challenge DNA was dispensable (FIG. 2B, lanes 5-7). These results, and the appearance of a new, fluorescently-labeled product whose size is consistent with cleaved non-target strand annealed to challenge DNA (FIG. 2B), indicate that single stranded challenge DNA does not compete for Cas9 binding, but instead anneals to the non-target strand. Strand removal was dependent on the concentration of this challenge DNA, as well as nuclease activity (FIG. 7B). By recruiting two RNP complexes to a single target DNA in a PAM-Inward orientation, as is used in paired-nick editing experiments¹⁶, it was found that the PAM-distal non-target strand could be removed from each complex with the appropriate challenge ssDNA (FIG. 2C). Hence, while Cas9 globally dissociates from duplex DNA in a symmetric fashion (FIG. 1B), it appears that the enzyme locally releases the PAM-distal non-target strand after cleavage but before dissociation. This strand is furthermore available for annealing to complementary challenge DNA and can be extruded from the Cas9-DNA complex by branch migration (FIG. 8).

FIG. 2: Complementary DNA anneals to the non-target strand of the RNP-dsDNA complex. (FIG. 2A) Schematic for EMSA assays. Association: RNP is equilibrated for 10 minutes with fluorescently labeled substrate DNA (magenta, Cy5; green, Cy3) containing protospacer-PAM sequences (blue and green arrows); Challenge/Dissociation: reactions are incubated with or without unlabeled challenge DNA for ten minutes and products are resolved on a native polyacrylamide gel. Nuclease variants (top right inset) are WT (two magenta arrows representing catalytically active nuclease domains), the Cas9D10A and Cas9H840A nickase variants (single magenta arrow), and the catalytically dead D10A/H840A dCas9 variant. DNA substrates (middle right inset) contain one or two protospacer-PAM sequences (arrows). Dual binding of RNP to substrate DNA was investigated using sgRNA pairs that targeted protospacer-PAM sites in PAM-In (1 and 2) or PAM-Out (A and B) orientations. Unlabeled challenge DNA (lower right inset) was provided as double-stranded (D) or single stranded (+ or −) species. Data presented are representative of n=2 biological replicates and cropped to highlight the region of interest. The Cas9, sgRNA, and substrate DNA for each EMSA experiment are schematically presented in FIG. 6. Nuclease activity was verified using denaturing gels (FIG. 5E). (FIG. 2B) Challenging a stable Cas9-DNA complex with ssDNA complementary to the PAM-distal non-target strand leads to removal of this strand from the complex. Challenge DNAs were identical to substrate DNA (S1 substrate challenge; lanes 2-4), identical to substrate DNA with PAM disrupted (S1 pam-challenge; lanes 5-7), or disrupted the complementarity of the sequence flanking the protospacer-PAM (S1 nh-challenge; lanes 8-10). Open triangle, RNP-DNA complex. (FIG. 2C) Loading multiple Cas9 molecules in a PAM-In orientation allows displacement of either PAM-distal non-target strand. One or two Cas9 molecules were loaded onto D1 substrate DNA, then challenged with the indicated challenge DNA species. Open triangle, RNP-DNA; solid grey triangle, 2×Cas9-DNA product. (FIG. 2D) Challenge DNA anneals to the uncut non-target strand when Cas9 nuclease domains are inactivated. EMSA performed as described in FIG. 2A. Cas9, Cas9D10A, Cas9H840A, and dCas9 nuclease variants were used as diagrammed. Open triangle, RNP-DNA; solid black triangle, supershifted product. (FIG. 2E) Challenge DNA anneals to the non-target strand when strand displacement is prevented by adjacent Cas9-DNA interactions in a PAM-Out orientation. EMSA performed as described in FIG. 2A, except the fluorophore location was varied. Cas9 and dCas9 nuclease variants were used as diagrammed. D1 substrate dsDNA was labeled with Cy5 on the + strand (solid square) or Cy3 on the − strand (open square). Challenge ssDNAs were labeled with Cy5 on the + strand (solid circle) or Cy3 on the − strand (open circle). Open triangle, RNP-DNA; solid black arrow, well-shifted products.

FIG. 6: Schematic of reagents and experimental design for EMSA experiments. Potential supershift products are presented where appropriate.

FIG. 7: (FIG. 7A) The non-target strand is released on the PAM-distal side of the cut. One Cas9 molecule was loaded onto substrate DNA fluorescently labeled at the 5′ or the 3′ terminus of each strand (FIG. 6). Only the 5′ non-target strand can be removed from the complex by a challenge DNA. Open arrow, RNP-DNA complex. (FIG. 7B) Removal of the non-target strand depends upon the concentration of the challenge DNA but is independent of the labeling fluorophore. Single RNP EMSA was conducted as described in FIG. 2A, except challenge concentration was varied from 0-1500 nM (0, 30, 75, 150, 300, 600, 1500 nM). Catalytically inactive dCas9 was used in lane 8 (dashed box) to demonstrate that nuclease activity is required for strand extrusion activity. Substrate DNA was fluorescently labeled at the 5′ termini with Cy5 or Cy3 as indicated. Open arrow, RNP-DNA complex. (FIG. 7C) Strand annealing occurs in single-RNP substrates when the non-target strand is left intact. Cas9 or dCas9 variants were loaded onto substrate DNA as indicated and as described in FIG. 6. Challenge concentration was varied from 0-5 μM (0, 500, 1500, 2500, 5000 nM). Open arrow, RNP-DNA complex; solid black arrow, supershifted products.

FIG. 8: Model for challenge-mediated non-target strand removal activity. 1) After duplex cleavage, Cas9 holds onto three ends of the target DNA (white crossed circles), but the PAM-distal non-target strand is released from the Cas9-DNA complex. 2) Complementary DNA anneals to the released strand. 3) Branch migration results in extrusion from the Cas9-DNA complex.

To determine whether Cas9 asymmetrically releases DNA during in vivo gene targeting in human cells, a terminal transferase end-labeling assay was used to measure the accessibility of the 3′-hydroxyls on either side of a Cas9-induced double stranded break immediately after cleavage. It was found that the non-target strand is PAM-distally end-labeled six-fold more effectively than the target strand, and this preferential labeling follows the orientation of the Cas9 complex, such that sgRNAs targeting opposite strands induce labeling of opposite hydroxyls (FIG. 9). Preferential accessibility of a single strand at the site of a Cas9 break raises the possibility that single strand break recognition may play an unanticipated role in Cas9-mediated genome editing.

FIG. 9: The non-target strand is available for enzymatic modification in cells. Cas9 was targeted to either strand of the AAVS1 locus (AAVS1-F or AAVS1-R) and terminal transferase was introduced to 3′ end-label cut DNA with biotin. After streptavidin immunoprecipitation, end-labeling on either side of the break was determined by the ability to qPCR amplify sequences using the indicated primer pairs (Left and Right). Results are presented as the mean±SD fold enrichment (n=3) of labeled DNA over uncut control DNA (ACT1).

Having shown that Cas9 releases the PAM-distal non-target strand but tightly engages the other three strands, the nature of complexes formed when strand extrusion is prevented, either by catalytic inactivation of nuclease domains or by topological prevention of branch migration, was explored. Combining substrate DNA with Cas9 or the Cas9H840A mutant, which both cut the non-target strand, preserved the challenge-dependent removal of the non-target strand. The Cas9D10A mutant and dCas9, which leave the non-target strand intact, instead exhibited a supershifted product when provided with target strand challenge. This is consistent with stable annealing of the ssDNA challenge to the non-target strand and formation of a Cas9-dsDNA-ssDNA complex (FIG. 2D and FIG. 7C). This model predicts that loading two RNPs onto a single substrate DNA in a PAM-Out orientation should prevent strand removal by branch migration because the stable protein-DNA interaction on the PAM-proximal side of each RNP would topologically block branch migration from the other complex (FIG. 6). This may be similar to the situation encountered during genomic targeting with even a single Cas9 nuclease, in which chromatin factors such as nucleosomes should prevent branch migration. Strand-annealing activity in PAM-Out complexes was investigated by directly monitoring the incorporation of fluorescently labeled challenge DNA into supershifted products. For PAM-Out RNPs bound to either strand of the substrate, addition of target strand challenge DNA resulted in retention of the fluorescent challenge in the wells rather than strand removal (FIG. 2E). This is in direct contrast to the PAM-In configuration, which does not present a topological barrier and allows strand removal (FIG. 2C). Taken together, these observations demonstrate that the non-target strand is accessible for annealing to complementary ssDNA even when branch migration is prevented. Cas9 binding therefore not only melts the non-target DNA strand from the target strand, but also renders the non-target strand accessible for annealing to exogenous nucleic acid.

Short ssDNA donors containing a mutation of interest have been used to stimulate homology-directed repair (HDR) events, the frequency of which can be increased by administering cell cycle or DNA damage repair inhibitors^(5,14). It was investigated whether designing an ssDNA donor to optimize annealing to the exposed non-target strand could boost the frequency of HDR events in the absence of chemical intervention. To explore this hypothesis, ssDNA donor molecules with varying sequence overlap on the 5′ and 3′ side of the break and complementary to either the non-target or target strand were generated, and their ability to support Cas9-mediated conversion of a stably-integrated BFP reporter to green fluorescent protein (GFP) via a three nucleotide mutation was measured (FIG. 3A-FIG. 3B). Nucleofection of Cas9 RNPs with a donor DNA complementary to the non-target strand stimulated HDR frequencies up to 2.6-fold greater than donor DNA complementary to the target strand and 4-fold greater than double strand donor DNA of the same length (FIG. 3C). Strikingly, asymmetric donor DNA optimized for annealing by overlapping the Cas9 cut site with 36 base pairs on the PAM-distal side, and with a 91 base pair extension on the PAM-proximal side of the break, supported HDR frequencies of 57±5%. This HDR frequency, obtained through simple rules of ssDNA donor design, is several fold greater than rates obtained using potentially undesirable chemical or genetic intervention, such as cell cycle blockade or knockdown of non-homologous end joining (NHEJ) repair^(3,6). Shorter or longer overlaps with the non-target strand compromised editing efficiency, possibly by reducing the stability of annealing to the non-target strand or requiring extensive invasion into the duplex region further away from the Cas9 complex. Donor DNA complementary to the non-target strand designed using these rules also increased HDR frequencies at the endogenous EMX1 locus (FIG. 11 and FIG. 12). Notably, while the geometric design principles used for increased HDR at EMX1 remained consistent with annealing to the non-target strand, the strandedness and polarity of the targeting sgRNA and hence the donor ssDNA are opposite those used when editing the exogenous BFP construct. Thus, HDR enhancement by precise donor-non-target strand complementarity appears robust to the choice of transcript template or coding strand.
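The donor geometry described above (36 nucleotides on the PAM-distal side of the cut and 91 nucleotides on the PAM-proximal side, complementary to the non-target strand) can be sketched in code. This is an illustrative sketch only, under stated assumptions: the input is the non-target strand written 5′ to 3′ (protospacer followed by PAM), `cut_index` is the position of the cut within that string, and any desired same-length substitution has already been written into the input sequence. The function names, parameters, and the toy sequence are hypothetical and are not taken from the disclosure.

```python
# Translation table for complementing DNA bases (upper or lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def asymmetric_donor(non_target_strand, cut_index,
                     pam_distal=36, pam_proximal=91):
    """Return an ssDNA donor (5' to 3') complementary to the non-target strand.

    non_target_strand: 5'->3' sequence containing protospacer then PAM.
    cut_index: number of bases 5' of the cut on that strand.
    pam_distal / pam_proximal: overlap lengths on each side of the cut.
    """
    start = cut_index - pam_distal
    end = cut_index + pam_proximal
    if start < 0 or end > len(non_target_strand):
        raise ValueError("sequence too short for the requested overlaps")
    window = non_target_strand[start:end]
    # The reverse complement places the long (PAM-proximal) arm at the
    # donor's 5' end and the short (PAM-distal) arm at its 3' end.
    return reverse_complement(window)


# Toy sequence: 40 A's on the PAM-distal side of the cut, 100 C's on the
# PAM-proximal side, so the two arms are easy to tell apart.
toy = "A" * 40 + "C" * 100
donor = asymmetric_donor(toy, cut_index=40)
print(len(donor), donor[:5], donor[-5:])
```

In this toy case the donor is 127 nucleotides long, with a 91 nucleotide 5′ arm matching the PAM-proximal side and a 36 nucleotide 3′ arm matching the PAM-distal side. A real design would also need to place the intended edit and confirm strand orientation against the actual locus.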

FIG. 3: Delivery of ssDNA donors complementary to the non-target strand drives efficient HDR using Cas9, nickases, and dCas9. (FIG. 3A) Schematic for HDR at a BFP reporter locus. Target strand (green) or non-target strand (magenta) donor ssDNAs were generated with the indicated overlaps on either side of the Cas9 cut site in the BFP reporter. The sequences of the unedited (wild type (WT), BFP) and edited loci (HDR, GFP) are presented inset (PAM reverse-complement, underlined; cut site, magenta arrow). (FIG. 3B) HDR, NHEJ, and unedited populations can be measured using flow cytometry. BFP-GFP flow cytometry scatter plots for BFP reporter cells (leftmost panel), BFP reporter cells edited with Cas9 (Cas9), or BFP reporter cells edited with the indicated nuclease and Donor Ht. Data shown are representative of n=2 experiments. Gated populations are WT, BFP+ cells; NHEJ, BFP− GFP− cells; and HDR, GFP+ cells. (FIG. 3C) Optimized donor DNA is complementary to the non-target strand and has a characteristic size. HDR frequencies for editing with target (t), non-target (n), or double-stranded (d) donor DNAs are presented at right as mean±SD for n>2 independent experiments. (FIG. 3D) Target strand donor stimulates greater levels of HDR for all Cas9 variants. HDR frequencies quantified from editing experiments using the indicated nuclease and Donor Ht (target strand, green) or Donor Hn (non-target strand, magenta). Data are presented as mean±SD from n>2 independent experiments. (FIG. 3E) Single or tiled-dCas9 molecules support HDR. HDR frequencies from dCas9 editing experiments as presented in FIG. 3D, except control (RNP only, Donor only) reactions are shown alongside editing reactions. RNP, single dCas9; tiled, equimolar amounts of dCas9 targeting four distinct sites on the coding strand of the BFP reporter.

FIG. 11: Strand-bias for optimized donor DNA is independent of genomic locus and gene transcription. Cas9 targets the template strand of the EMX1 locus as diagrammed at left. Target strand (blue) or non-target strand (orange) donor ssDNAs were generated with the indicated overlaps on either side of the Cas9 cut site at EMX1. The sequences of the unedited and edited loci are presented inset (PAM sequence, underlined; cut site, magenta arrow; PciI site, bold font). HDR frequencies for editing with each donor are presented at right as mean±SD for n>2 independent experiments.

FIG. 12: (FIG. 12A) The EMX1 locus is not cut as efficiently as the BFP locus. PCR amplification and T7E1 digestion were performed on cells edited using the indicated donor DNA (N/A, no donor; N/C, no Cas9). % Cut was quantified by gel densitometry. Compare to 95% total editing seen at the BFP locus (FIG. 10A). (FIG. 12B) HDR incorporation of a PciI site into the EMX1 locus shows donor strand-bias. PCR amplification (−) or PCR amplification and PciI digestion (+) was performed on cells edited using the indicated donor DNA (N/A, no donor). % Cut was quantified by gel densitometry and used to generate bar graphs in FIG. 11.

Because it was observed that in vitro strand annealing is independent of nuclease activity (FIG. 2), the potential of optimized ssDNA donors to edit the BFP reporter when paired with Cas9 variants in which one or both nuclease domains were disrupted was investigated. Cas9D10A stimulates 2-3% HDR when delivered via plasmid in a “paired nick” configuration^(16,17) but efficient HDR with a single nickase has not been reported¹⁸. Using RNP electroporation, it was observed that Cas9D10A (nicking the target strand) and Cas9H840A (nicking the non-target strand) each stimulated HDR frequencies of ~10% when provided with target strand donor DNA, but also silenced the BFP reporter, potentially by inducing error-prone NHEJ (FIG. 3B, FIG. 3D and FIG. 10A). This latter observation raises concerns about the use of paired nickases for editing, since off-target cuts associated with each single nickase could be mutagenic. The high efficiency of editing with RNP delivery relative to plasmid delivery may reveal these unappreciated NHEJ events, since indels caused by individual plasmid-expressed nickases have previously been reported at the low end of the detection level for T7E1 or Surveyor assays^(8,16,17). Surprisingly, a small but measurable (0.4±0.0%) frequency of HDR was observed when catalytically inactive dCas9 was used in editing experiments (FIG. 3B and FIG. 3D). dCas9 was less effective at stimulating HDR than Cas9, Cas9D10A, or Cas9H840A but presumably stimulates editing without introducing breaks in genomic DNA (FIG. 10A). For all nuclease variants tested, HDR occurred at approximately a two-fold higher frequency when donor DNA complementary to the non-target strand was provided relative to donor DNA complementary to the target strand (FIG. 3D), suggesting that strand annealing impacts HDR in the absence of nuclease activity, consistent with the ability to stably form Cas9-dsDNA-ssDNA complexes in vitro (FIG. 2D, FIG. 2E and FIG. 7C).

FIG. 10: (FIG. 10A) Representative flow cytometry data used to create bar graphs shown in FIG. 3B. (FIG. 10B) Representative flow cytometry data used to create bar graphs shown in FIG. 3C.

dCas9-mediated HDR is reminiscent of reports of ODN-mediated repair¹⁹, in which single stranded donor DNA invades uncut genomic duplex and provides a template for conversion. As the annealing step is thought to be rate limiting in ODN-mediated repair, it was speculated that binding multiple dCas9 molecules to the same strand would displace large portions of genomic DNA for annealing to the ssDNA donor and might increase the frequency of dCas9-mediated HDR. Tiling four dCas9 molecules on the same strand resulted in a 2-fold increase (to 0.7±0.3%) in HDR when paired with an ssDNA donor that can anneal to the non-target strand, but a 2-fold decrease (0.2±0.04%) when paired with a non-annealing donor (FIG. 3E and FIG. 10B). The rate of dCas9-mediated HDR, while low compared to wild type Cas9-mediated mutation, is substantially greater than oligonucleotide alone (0.02±0.03%; FIG. 3E). dCas9-mediated mutation could be useful in therapeutic applications where cleavage at target or off-target sites stimulates undesirable NHEJ events.

Cas9 is an antiviral restriction enzyme that has rapidly been adopted as a tool for gene editing, but relatively little is known about the steps that occur between genome cleavage and subsequent repair. It was demonstrated that although Cas9 binds stably to DNA substrates, it makes one strand upstream of the PAM and identical in sequence to the RNA protospacer accessible both in vitro and in vivo. Cas9-mediated interrogation of potential Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) targets has recently been shown to play a role in protospacer acquisition by Cas1, Cas2, and Csn2²⁰, and it is tempting to speculate that Cas9's ability to render the non-target strand available for annealing to exogenous factors could potentiate this process. The ability of Cas9 to release one PAM-distal strand of DNA is consistent with recently published structural data showing that the target DNA strand is buried from solvent and wrapped around the sgRNA, whereas the RuvC active site lines a wide, solvent-exposed basic cleft that appears poised to channel the PAM-distal region of the non-target strand (FIG. 13). This structural asymmetry also explains an early observation that Cas9 cleaves the target strand in one precise location 3 base pairs from the PAM while the non-target strand is cut in variable locations¹⁶, likely because the strand is free to breathe in and out of the nuclease domain. Collectively, the biochemical and structural asymmetry of Cas9 interaction with substrate DNA indicates that Cas9 is not conceptually equivalent to other targeted nucleases such as zinc finger nucleases (ZFNs) or transcription activator-like effector nucleases (TALENs), which have symmetric FokI catalytic sites and have not been observed to preferentially release one strand of DNA. Thus, strategies for Cas9-mediated editing may differ from those used with other gene-editing tools, and may even differ for engineered dCas9-FokI editing²¹.

FIG. 13: Structural data are consistent with asymmetric release of substrate by Cas9. A surface electrostatic view of Cas9, sgRNA (orange), and non-target (purple) or target (grey) DNA strands⁷. PAM-Cas9 interaction, white arrow; putative path of non-target strand, purple dots; presumed direction of non-target strand extrusion, black arrow.

Most optimization of genome editing has focused on biasing nuclease-based editing towards HDR and away from NHEJ by chemically or genetically inactivating components of the NHEJ pathway, chemically activating HDR, or manipulating the cell cycle³⁻⁶. These trans interventions, while highly useful in some contexts, may be undesirable during therapeutic gene editing because they diminish the cellular capacity to respond to damage at other sites in the genome. It was found that Cas9-mediated HDR frequencies can be increased by rationally designing the orientation, polarity, and length of the donor ssDNA to match the properties of the Cas9-DNA complex. It was also found that these donor designs paired with tiled catalytically-inactive dCas9 molecules could stimulate HDR approximately 50-fold greater than donor alone. It is currently unclear whether the enhancement of HDR with either Cas9 or dCas9 occurs via a direct mechanism (e.g. mimicking a specific HDR intermediate structure that is recognized by the cell), or indirectly (e.g. increasing the local concentration of the repair template). Simple strategies discovered here will be valuable for basic research and therapeutic gene editing applications, for example correcting a disease causing allele to the wild type sequence.

REFERENCES

-   1 Doudna, J. A. & Charpentier, E. Genome editing. The new frontier of genome engineering with CRISPR-Cas9. Science (New York, N.Y.) 346, 1258096, doi:10.1126/science.1258096 (2014).
-   2 Jiang, W. & Marraffini, L. A. CRISPR-Cas: New Tools for Genetic Manipulations from Bacterial Immunity Systems. Annual review of microbiology, doi:10.1146/annurev-micro-091014-104441 (2015).
-   3 Chu, V. T. et al. Increasing the efficiency of homology-directed repair for CRISPR-Cas9-induced precise gene editing in mammalian cells. Nature Biotechnology, doi:10.1038/nbt.3198 (2015).
-   4 Davis, L. & Maizels, N. Homology-directed repair of DNA nicks via pathways distinct from canonical double-strand break repair. Proceedings of the National Academy of Sciences of the United States of America 111, E924-932, doi:10.1073/pnas.1400236111 (2014).
-   5 Lin, S., Staahl, B. T., Alla, R. K. & Doudna, J. A. Enhanced homology-directed human genome engineering by controlled timing of CRISPR/Cas9 delivery. eLife 4, doi:10.7554/eLife.04766 (2014).
-   6 Maruyama, T. et al. Increasing the efficiency of precise genome editing with CRISPR-Cas9 by inhibition of nonhomologous end joining. Nature Biotechnology, doi:10.1038/nbt.3190 (2015).
-   7 Anders, C., Niewoehner, O., Duerst, A. & Jinek, M. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513, 569-573, doi:10.1038/nature13579 (2014).
-   8 Nishimasu, H. et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156, 935-949, doi:10.1016/j.cell.2014.02.001 (2014).
-   9 Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science (New York, N.Y.) 337, 816-821, doi:10.1126/science.1225829 (2012).
-   10 Carroll, D. Genome engineering with zinc-finger nucleases. Genetics 188, 773-782, doi:10.1534/genetics.111.131433 (2011).
-   11 Gaj, T., Gersbach, C. A. & Barbas, C. F. ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering. Trends in biotechnology 31, 397-405, doi:10.1016/j.tibtech.2013.04.004 (2013).
-   12 Sternberg, S., Redding, S., Jinek, M., Greene, E. & Doudna, J. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature (2014).
-   13 Abdiche, Y., Malashock, D., Pinkerton, A. & Pons, J. Determining kinetics and affinities of protein interactions using a parallel real-time label-free biosensor, the Octet. Analytical biochemistry 377, 209-217, doi:10.1016/j.ab.2008.03.035 (2008).
-   14 Kim, S., Kim, D., Cho, S. W., Kim, J. & Kim, J.-S. Highly efficient RNA-guided genome editing in human cells via delivery of purified Cas9 ribonucleoproteins. Genome Research 24, 1012-1019, doi:10.1101/gr.171322.113 (2014).
-   15 Metzger, L. & Iliakis, G. Kinetics of DNA double-strand break repair throughout the cell cycle as assayed by pulsed field gel electrophoresis in CHO cells. International journal of radiation biology 59, 1325-1339 (1991).
-   16 Ran, F. A. et al. Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154, 1380-1389, doi:10.1016/j.cell.2013.08.021 (2013).
-   17 Mali, P. et al. CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nature Biotechnology 31, 833-838, doi:10.1038/nbt.2675 (2013).
-   18 Trevino, A. E. & Zhang, F. Genome Editing Using Cas9 Nickases. 1 edn, Vol. 546 (Elsevier Inc., 2014).
-   19 Engstrom, J. U., Suzuki, T. & Kmiec, E. B. Regulation of targeted gene repair by intrinsic cellular processes. BioEssays: news and reviews in molecular, cellular and developmental biology 31, 159-168, doi:10.1002/bies.200800119 (2009).
-   20 Heler, R. et al. Cas9 specifies functional viral targets during CRISPR-Cas adaptation. Nature 519, 199-202, doi:10.1038/nature14245 (2015).
-   21 Tsai, S. Q. et al. Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing. Nature Biotechnology 32, 569-576, doi:10.1038/nbt.2908 (2014).
-   22 Anders, C. & Jinek, M. In vitro enzymology of Cas9. Methods in enzymology 546, 1-20, doi:10.1016/B978-0-12-801185-0.00001-5 (2014).
-   23 Aparicio, O. et al. Chromatin immunoprecipitation for determining the association of proteins with specific genomic sequences in vivo. Current protocols in molecular biology/edited by Frederick M Ausubel [et al] Chapter 21, Unit 21.23, doi:10.1002/0471142727.mb2103s69 (2005).

While the present invention has been described with reference to the specific embodiments thereof, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation, material, composition of matter, process, process step or steps, to the objective, spirit and scope of the present invention. All such modifications are intended to be within the scope of the claims appended hereto.

What is claimed is:
 1. A method of editing genomic DNA of a eukaryotic cell, wherein the genomic DNA comprises a target strand and a non-target strand, the method comprising introducing into the cell: (a) a Cas9 guide RNA, or one or more nucleic acids encoding said Cas9 guide RNA, wherein the Cas9 guide RNA hybridizes to a target sequence of the target strand of the genomic DNA; (b) an asymmetric double stranded or single stranded donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides in length, is shorter than the 5′ homology arm, and comprises at least 10 consecutive nucleotides of said target sequence; and (c) a Cas9 protein or a nucleic acid encoding said Cas9 protein, wherein (i) the Cas9 protein forms a complex with the Cas9 guide RNA thereby guiding the Cas9 protein to said target sequence, (ii) the 3′ homology arm of the donor DNA molecule hybridizes to the non-target strand of the genomic DNA, and (iii) a nucleotide sequence of the donor DNA molecule is incorporated into the genomic DNA.
 2. The method according to claim 1, wherein the Cas9 protein comprises a functional RuvC domain and cleaves at least the non-target strand of genomic DNA.
 3. The method according to claim 1 or 2, wherein the Cas9 protein comprises a functional HNH domain and cleaves at least the target strand of genomic DNA.
 4. The method according to any of claims 1-3, wherein the donor DNA molecule is single stranded.
 5. The method according to any of claims 1-4, wherein the 5′ homology arm of the donor DNA molecule is 40 to 200 nucleotides in length.
 6. The method according to any of claims 1-5, wherein the donor DNA molecule comprises a heterologous nucleotide sequence, between the 5′ and 3′ homology arms, that is incorporated into the genomic DNA.
 7. The method according to any of claims 1-6, wherein the donor DNA molecule comprises one or more synthetic modifications selected from: a base modification, a sugar modification, and a backbone modification.
 8. The method according to claim 7, wherein the donor DNA molecule comprises a phosphorothioate linkage.
 9. A system for editing genomic DNA in a eukaryotic cell, the system comprising: (a) a Cas9 guide RNA, or one or more nucleic acids encoding said Cas9 guide RNA, wherein the Cas9 guide RNA comprises a guide sequence that is complementary to a target sequence of a target strand of genomic DNA of a eukaryotic cell; and (b) an asymmetric double stranded or single stranded donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides in length, is shorter than the 5′ homology arm, and comprises at least 10 consecutive nucleotides of said target sequence.
 10. The system of claim 9, further comprising a Cas9 protein or a nucleic acid encoding said Cas9 protein.
 11. The system of claim 10, wherein the Cas9 protein comprises a functional RuvC domain.
 12. The system of claim 10 or 11, wherein the Cas9 protein comprises a functional HNH domain.
 13. The system of any of claims 9-12, wherein the donor DNA molecule is single stranded.
 14. The system of any of claims 9-13, wherein the 5′ homology arm of the donor DNA molecule is 40 to 200 nucleotides in length.
 15. The system of any of claims 9-14, wherein the donor DNA molecule comprises a nucleotide sequence, between the 5′ and 3′ homology arms, that is heterologous to said genomic DNA.
 16. The system of any of claims 9-15, wherein the donor DNA molecule comprises one or more synthetic modifications selected from: a base modification, a sugar modification, and a backbone modification.
 17. The system of claim 16, wherein the donor DNA molecule comprises a phosphorothioate linkage.
 18. A method of editing a target genomic DNA of a eukaryotic cell, the method comprising introducing into the eukaryotic cell: (a) a dead Cas9 (dCas9) protein, or a nucleic acid encoding said dCas9 protein, wherein the dCas9 protein does not cleave the target genomic DNA; (b) a Cas9 guide RNA, or one or more nucleic acids encoding said Cas9 guide RNA, wherein the Cas9 guide RNA hybridizes to a target sequence of the target genomic DNA; and (c) a corresponding double stranded or single stranded donor DNA template molecule comprising at least 10 consecutive nucleotides of said target sequence, wherein the dCas9 protein forms a complex with the Cas9 guide RNA thereby guiding the dCas9 protein to said target sequence, and wherein a nucleotide sequence of the donor DNA molecule is incorporated into the genomic DNA.
 19. The method according to claim 18, wherein the donor DNA template molecule is an asymmetric donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides in length, is shorter than the 5′ homology arm, and comprises at least 10 consecutive nucleotides of said target sequence.
 20. The method according to claim 19, wherein the 5′ homology arm is 40 to 200 nucleotides in length.
 21. The method according to any of claims 18-20, wherein the donor DNA template molecule is single stranded.
 22. The method according to any of claims 18-21, wherein the donor DNA template molecule comprises a heterologous nucleotide sequence that is incorporated into the genomic DNA.
 23. The method according to any of claims 18-22, wherein said introducing comprises introducing into the cell two or more Cas9 guide RNAs, or one or more nucleic acids encoding said two or more Cas9 guide RNAs, wherein the two or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides.
 24. The method according to claim 23, wherein the two or more Cas9 guide RNAs hybridize to target sequences that overlap with the donor DNA template molecule.
 25. The method according to claim 23, wherein said introducing comprises introducing into the cell three or more Cas9 guide RNAs, or one or more nucleic acids encoding said three or more Cas9 guide RNAs, wherein the three or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides.
 26. The method according to claim 25, wherein the three or more Cas9 guide RNAs hybridize to target sequences that overlap with the donor DNA template molecule.
 27. The method according to claim 25, wherein said introducing comprises introducing into the cell four or more Cas9 guide RNAs, or one or more nucleic acids encoding said four or more Cas9 guide RNAs, wherein the four or more Cas9 guide RNAs hybridize to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides.
 28. The method according to claim 27, wherein the four or more Cas9 guide RNAs hybridize to target sequences that overlap with the donor DNA template molecule.
 29. The method according to any of claims 18-28, wherein the donor DNA molecule comprises one or more synthetic modifications selected from: a base modification, a sugar modification, and a backbone modification.
 30. The method according to claim 29, wherein the donor DNA molecule comprises a phosphorothioate linkage.
 31. A system for editing genomic DNA in a eukaryotic cell, the system comprising: (a) a dead Cas9 (dCas9) protein, or a nucleic acid encoding said dCas9 protein, wherein the dCas9 protein lacks catalytically active RuvC and HNH domains; (b) a Cas9 guide RNA, or one or more nucleic acids encoding said Cas9 guide RNA, wherein the Cas9 guide RNA comprises a guide sequence that is complementary to a target sequence of a target genomic DNA of a eukaryotic cell; and (c) a corresponding double stranded or single stranded donor DNA template molecule comprising at least 10 consecutive nucleotides of said target sequence.
 32. The system of claim 31, wherein the donor DNA template molecule is an asymmetric donor DNA molecule comprising a 5′ homology arm and a 3′ homology arm, wherein the 3′ homology arm is 20 to 50 nucleotides in length, is shorter than the 5′ homology arm, and comprises the at least 10 consecutive nucleotides of said target sequence.
 33. The system of claim 32, wherein the 5′ homology arm of the donor DNA molecule is 40 to 200 nucleotides in length.
 34. The system of any of claims 31-33, wherein the donor DNA template molecule is single stranded.
 35. The system of any of claims 31-34, wherein the donor DNA template molecule comprises a nucleotide sequence that is heterologous to said target genomic DNA.
 36. The system of any of claims 31-35, wherein the system comprises two or more Cas9 guide RNAs, or one or more nucleic acids encoding said two or more Cas9 guide RNAs, wherein the guide sequences of the two or more Cas9 guide RNAs are complementary to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides.
 37. The system of claim 36, wherein the two or more Cas9 guide RNAs are complementary to target sequences that overlap with the donor DNA template molecule.
 38. The system of claim 36, wherein the system comprises three or more Cas9 guide RNAs, or one or more nucleic acids encoding said three or more Cas9 guide RNAs, wherein the guide sequences of the three or more Cas9 guide RNAs are complementary to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides.
 39. The system of claim 38, wherein the three or more Cas9 guide RNAs are complementary to target sequences that overlap with the donor DNA template molecule.
 40. The system of claim 38, wherein the system comprises four or more Cas9 guide RNAs, or one or more nucleic acids encoding said four or more Cas9 guide RNAs, wherein the guide sequences of the four or more Cas9 guide RNAs are complementary to target sequences that do not overlap with one another and are each separated from one another by 1-100 nucleotides.
 41. The system of claim 40, wherein the four or more Cas9 guide RNAs are complementary to target sequences that overlap with the donor DNA template molecule.
 42. The system of any of claims 31-41, wherein the donor DNA molecule comprises one or more synthetic modifications selected from: a base modification, a sugar modification, and a backbone modification.
 43. The system of claim 42, wherein the donor DNA molecule comprises a phosphorothioate linkage.
 44. The system of any of claims 31-43, wherein the system comprises a eukaryotic cell comprising said target genomic DNA.